BRASCAMP-LIEB INEQUALITIES FOR 
NON-COMMUTATIVE INTEGRATION 



Eric A. Carlen 1 and Elliott H. Lieb 2 

1. Department of Mathematics, Hill Center,
Rutgers University, 110 Frelinghuysen Road Piscataway NJ 08854-8019 USA
2. Departments of Mathematics and Physics, Jadwin Hall,
Princeton University, P. O. Box 708, Princeton, NJ 08544



August 4, 2008



Abstract 




We formulate a non-commutative analog of the Brascamp-Lieb inequality, and prove it in 
several concrete settings. 

Mathematics subject classification numbers: 47C15, 15A45, 26D15

Key Words: inequalities, traces, non-commutative integration



1 Introduction 

1.1 Young's inequality in the context of ordinary Lebesgue integration 

In this paper, we shall extend the class of generalized Young's inequalities known as Brascamp-Lieb 
inequalities (B-L inequalities) to an operator algebra setting entailing non-commutative integration. 
The original Young's inequality [38] states that for non negative measurable functions $f_1$, $f_2$
and $f_3$ on $\mathbb{R}$, and $1 \le p_1, p_2, p_3 \le \infty$, with $1/p_1 + 1/p_2 + 1/p_3 = 2$,

$$\int_{\mathbb{R}^2} f_1(x) f_2(x-y) f_3(y)\,dx\,dy \le \left(\int_{\mathbb{R}} f_1^{p_1}(t)\,dt\right)^{1/p_1} \left(\int_{\mathbb{R}} f_2^{p_2}(t)\,dt\right)^{1/p_2} \left(\int_{\mathbb{R}} f_3^{p_3}(t)\,dt\right)^{1/p_3} . \tag{1.1}$$

Thus, it provides an estimate of the integral of a product of functions in terms of a product of $L^p$
norms of these functions. The crucial difference with a Hölder type inequality is that the integrals
on the right are integrals over only $\mathbb{R}$, while the integrals on the left are integrals over $\mathbb{R}^2$, and
none of the three factors in the product on the left - $f_1(x)$, $f_2(x-y)$ or $f_3(y)$ - are integrable (to any
power) on $\mathbb{R}^2$.

To frame the inequality in terms that are more amenable to the generalizations considered here,
define the maps $\phi_j : \mathbb{R}^2 \to \mathbb{R}$, $j = 1, 2, 3$, by

$$\phi_1(x,y) = x , \qquad \phi_2(x,y) = x - y \qquad \text{and} \qquad \phi_3(x,y) = y .$$

Then (1.1) can be rewritten as

$$\int_{\mathbb{R}^2} \left(\prod_{j=1}^3 f_j \circ \phi_j\right) d^2x \le \prod_{j=1}^3 \left(\int_{\mathbb{R}} f_j^{p_j}(t)\,dt\right)^{1/p_j} . \tag{1.2}$$

There is now no particular reason to limit ourselves to products of only three functions, or to 
integrals over R 2 and R, or even any Euclidean space for that matter: 

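1.1 DEFINITION. Given measure spaces $(\Omega, \mathcal{S}, \mu)$ and $(M_j, \mathcal{M}_j, \nu_j)$, $j = 1,\dots,N$, not
necessarily distinct, together with measurable functions $\phi_j : \Omega \to M_j$ and numbers $p_1,\dots,p_N$ with
$1 \le p_j \le \infty$, $1 \le j \le N$, we say that a B-L inequality holds for $\{\phi_1, \dots, \phi_N\}$ and $\{p_1, \dots, p_N\}$ in
case there is a finite constant $C$ such that

$$\int_{\Omega} \prod_{j=1}^N f_j \circ \phi_j \, d\mu \le C \prod_{j=1}^N \left(\int_{M_j} f_j^{p_j}\, d\nu_j\right)^{1/p_j} \tag{1.3}$$

holds whenever each $f_j$ is non negative and measurable on $M_j$, $j = 1, \dots, N$.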

There are by now many examples. One of the oldest is the original discrete Young's inequality. 
In the current notation, this concerns the case in which $\Omega = \mathbb{Z}^2$ equipped with counting measure,
$N = 3$, and each $M_j$ is $\mathbb{Z}$, equipped with counting measure. Then with

$$\phi_1(m,n) = m , \qquad \phi_2(m,n) = m - n \qquad \text{and} \qquad \phi_3(m,n) = n ,$$

(1.2) holds for any three non-negative functions $f_j : \mathbb{Z} \to \mathbb{R}_+$ under the same conditions on the $p_j$ as
in the continuous case; i.e., $1/p_1 + 1/p_2 + 1/p_3 = 2$. There is a significant difference: In the discrete
case, the constant $C = 1$ is sharp, and there is equality if and only if one of the $f_j$ is identically
zero, or else $f_1$ vanishes except at some $m_0$, $f_3$ vanishes except at some $n_0$, and $f_2$ vanishes except
at $m_0 - n_0$. The inequality itself is due to Young [38], while the statement about cases of equality
is proved in [20], where the authors also consider extensions to more than three functions.
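The discrete statement is easy to test numerically. The following is a minimal sketch in plain Python, not taken from the paper: the helper names (`lp_norm`, `young_lhs`, `young_rhs`), the choice $p_1 = p_2 = p_3 = 3/2$ and the supports are illustrative assumptions. It checks the inequality with $C = 1$ on random finitely supported functions and the equality case for point masses at $m_0$, $m_0 - n_0$ and $n_0$.

```python
import random

def lp_norm(f, p):
    """l^p norm of a finitely supported, non-negative function on Z ({site: value})."""
    return sum(v ** p for v in f.values()) ** (1.0 / p)

def young_lhs(f1, f2, f3):
    """Sum over (m, n) in Z^2 of f1(m) * f2(m - n) * f3(n)."""
    return sum(v1 * f2.get(m - n, 0.0) * v3
               for m, v1 in f1.items()
               for n, v3 in f3.items())

def young_rhs(f1, f2, f3, p1, p2, p3):
    """Product of the three l^p norms (constant C = 1)."""
    return lp_norm(f1, p1) * lp_norm(f2, p2) * lp_norm(f3, p3)

# Exponents with 1/p1 + 1/p2 + 1/p3 = 2 (an arbitrary admissible choice).
P = (1.5, 1.5, 1.5)

# Random non-negative functions supported on {-5, ..., 5}.
rng = random.Random(0)
random_fs = [{k: rng.random() for k in range(-5, 6)} for _ in range(3)]
generic_gap = young_rhs(*random_fs, *P) - young_lhs(*random_fs)

# Point masses at m0, m0 - n0 and n0: the equality case described above.
m0, n0 = 3, -2
deltas = [{m0: 2.0}, {m0 - n0: 0.5}, {n0: 4.0}]
delta_gap = young_rhs(*deltas, *P) - young_lhs(*deltas)

print(generic_gap >= 0, abs(delta_gap) < 1e-12)
```

For generic data the gap is strictly positive, while for the three point masses both sides coincide up to rounding.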

In the continuous case, a much wider generalization to more than three functions was made by
B-L in [5], where the sharp constant in Young's inequality - which is strictly less than 1 unless
$p_1 = p_2 = 1$ - was obtained, with a proof that the only non-negative functions yielding equality are
certain Gaussian functions. (This best constant was also obtained at the same time by Beckner [4],
for three functions.)

These inequalities generalize from $\mathbb{R}$ to $\mathbb{R}^n$. The complete generalization to the case in which
the $M_j$ are all Euclidean spaces, but of different dimension, and the $\phi_j$ are linear transformations
from $\mathbb{R}^n$ to $M_j$, was proved by Lieb [23]. Again, the maximizers are Gaussians. Another proof
of this generalized version, together with a reverse form, was obtained by Barthe [1], who also
provided a detailed analysis of the cases of equality in the original B-L inequality from [5]. The
cases of equality in the higher dimensional generalization from [23] were analyzed in detail in [7, 8].

Examples in which $\Omega$ is the sphere $S^{N-1}$ or the permutation group $S_N$ were proved in [13, 14],
and the above definition of B-L inequalities in arbitrary measure spaces is taken from [10], where
a duality between B-L inequalities and subadditivity of entropy inequalities is proved.

1.2 A generalized Young's inequality in the context of non commutative integration

In non commutative integration theory, as developed by Irving Segal [30, 31, 32], the basic framework
is a triple $(\mathcal{H}, \mathfrak{A}, \lambda)$ where $\mathcal{H}$ is a Hilbert space, $\mathfrak{A}$ is a W* algebra (a von Neumann algebra)
of operators on $\mathcal{H}$, and $\lambda$ is a positive linear functional on the finite rank operators in $\mathfrak{A}$. In Segal's
picture, the algebra $\mathfrak{A}$ corresponds to the algebra of bounded measurable functions, and applying
the positive linear functional $\lambda$ to a positive operator corresponds to taking the integral of a
positive function. That is,

$$A \mapsto \lambda(A) \qquad \text{corresponds to} \qquad f \mapsto \int_M f \, d\nu .$$

Such a triple $(\mathcal{H}, \mathfrak{A}, \lambda)$ is called a non commutative integration space. Certain natural regularity
properties must be imposed on $\lambda$ if one is to get a well-behaved non-commutative integration theory,
but we shall not go into these here as the examples that we consider are all based on the case in
which $\lambda$ is the trace on operators on $\mathcal{H}$, or some closely related functional, for which discussion of
these extra conditions would be a digression.

In this operator algebra setting, there are natural non-commutative analogs of the usual $L^p$
spaces: If $A$ is a finite rank operator in $\mathfrak{A}$, and $1 \le q < \infty$, define

$$\|A\|_{q,\lambda} = \left(\lambda\left((A^*A)^{q/2}\right)\right)^{1/q} .$$

This defines a norm (under appropriate conditions on $\lambda$ that are obvious for the trace), and the
completion of the space of finite rank operators in $\mathfrak{A}$ under this norm defines a non-commutative
$L^p$ space. (The completion may contain unbounded operators "affiliated" to $\mathfrak{A}$.) For more on the
general theory of non-commutative integration, see the early papers [15, 30, 32, 33] and the more
recent work in [16, 19, 21, 25].
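For matrices with $\lambda = \mathrm{Tr}$, the norm above is the Schatten $q$-norm and can be computed from singular values. The sketch below assumes NumPy; the function name `nc_lp_norm` and the test matrices are illustrative choices, not notation from the paper. It checks that the norm reduces to the usual $\ell^q$ norm on diagonal matrices, and tests the triangle inequality and the trace form of Hölder's inequality on random matrices.

```python
import numpy as np

def nc_lp_norm(A, q):
    """(Tr (A* A)^{q/2})^{1/q}; the eigenvalues of (A* A)^{1/2} are the singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s ** q) ** (1.0 / q))

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

# Commutative sanity check: on diagonal matrices this is the ordinary l^q norm.
diag_norm = nc_lp_norm(np.diag([3.0, -4.0]), 2)       # sqrt(9 + 16)

# Two norm properties, checked numerically on random matrices.
q, q_dual = 3.0, 1.5                                   # 1/q + 1/q_dual = 1
triangle_ok = nc_lp_norm(A + B, q) <= nc_lp_norm(A, q) + nc_lp_norm(B, q) + 1e-12
holder_ok = abs(np.trace(A @ B)) <= nc_lp_norm(A, q) * nc_lp_norm(B, q_dual) + 1e-12

print(diag_norm, triangle_ok, holder_ok)
```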

To frame an analog of (1.3) in an operator algebra setting, we replace the measure spaces by
non commutative integration spaces:

$$(\Omega, \mathcal{S}, \mu) \longrightarrow (\mathcal{H}, \mathfrak{A}, \lambda) \qquad \text{and} \qquad (M_j, \mathcal{M}_j, \nu_j) \longrightarrow (\mathcal{H}_j, \mathfrak{A}_j, \lambda_j) , \quad j = 1,\dots,N .$$

The right hand side of (1.3) has an obvious generalization to the operator algebra setting in terms
of the non-commutative $L^p$ norms introduced above.

As for the left hand side of (1.3), regard $f_j \mapsto f_j \circ \phi_j$ as a W* algebra homomorphism (which,
restricted to the W* algebra $L^\infty(M_j)$, it is), and suppose we are given W* homomorphisms

$$\phi_j : \mathfrak{A}_j \to \mathfrak{A} .$$

Then each $\phi_j(A_j)$ belongs to $\mathfrak{A}$; however, in the non-commutative case, the product of the $\phi_j(A_j)$
depends on their order in the product, and need not even be self-adjoint - let alone positive - even
if each of the $A_j$ is positive.

Therefore, let us return to the left side of (1.3), and suppose that each $f_j$ is strictly positive.
Then defining

$$h_j = \ln(f_j) \qquad \text{so that} \qquad f_j \circ \phi_j = e^{h_j \circ \phi_j} ,$$

we can then rewrite (1.3) as

$$\int_{\Omega} \exp\left[\sum_{j=1}^N h_j \circ \phi_j\right] d\mu \le C \prod_{j=1}^N \left(\int_{M_j} e^{p_j h_j}\, d\nu_j\right)^{1/p_j} . \tag{1.4}$$



We can now formulate our operator algebra analog of (1.3):

1.2 DEFINITION. Given non commutative integration spaces $(\mathcal{H}, \mathfrak{A}, \lambda)$ and $(\mathcal{H}_j, \mathfrak{A}_j, \lambda_j)$, $j = 1, \dots, N$,
together with W* algebra homomorphisms $\phi_j : \mathfrak{A}_j \to \mathfrak{A}$, $j = 1, \dots, N$, and indices
$1 \le p_j \le \infty$, $j = 1, \dots, N$, a non-commutative B-L inequality holds for $\{\phi_1, \dots, \phi_N\}$ and $\{p_1, \dots, p_N\}$
if there is a finite constant $C$ so that

$$\lambda\left(\exp\left[\sum_{j=1}^N \phi_j(H_j)\right]\right) \le C \prod_{j=1}^N \left(\lambda_j \exp\left[p_j H_j\right]\right)^{1/p_j} \tag{1.5}$$

whenever $H_j$ is self-adjoint in $\mathfrak{A}_j$, $j = 1, \dots, N$.



In this paper, we are concerned with determining the indices and the best constant C for which 
such an inequality holds, and shall focus on two examples: The first concerns operators on tensor 
products of Hilbert spaces, and the second concerns Clifford algebras. 



1.3 A generalized Young's inequality for tensor products 

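1.3 EXAMPLE. Let $\mathcal{H}_j$, $j = 1,\dots,n$ be separable Hilbert spaces, and let $\mathcal{K}$ denote the
tensor product

$$\mathcal{K} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n .$$

Define $\mathfrak{A}$ to be $\mathcal{B}(\mathcal{K})$, the algebra of bounded operators on $\mathcal{K}$, and define $\lambda$ to be Tr, the trace
on $\mathcal{K}$, so that $(\mathcal{H}, \mathfrak{A}, \lambda) = (\mathcal{K}, \mathcal{B}(\mathcal{K}), \mathrm{Tr})$.

For any non empty subset $J$ of $\{1, \dots, n\}$, let $\mathcal{K}_J$ denote the tensor product

$$\mathcal{K}_J = \bigotimes_{j \in J} \mathcal{H}_j .$$

Define $\mathfrak{A}_J$ to be $\mathcal{B}(\mathcal{K}_J)$, the algebra of bounded operators on $\mathcal{K}_J$, and define $\lambda_J$ to be $\mathrm{Tr}_J$, the trace
on $\mathcal{K}_J$, so that $(\mathcal{H}_J, \mathfrak{A}_J, \lambda_J) = (\mathcal{K}_J, \mathcal{B}(\mathcal{K}_J), \mathrm{Tr}_J)$.

There are natural homomorphisms $\phi_J$ embedding the $2^n - 1$ algebras $\mathfrak{A}_J$ into $\mathfrak{A}$. For instance,
if $J = \{1,2\}$,

$$\phi_{\{1,2\}}(A_1 \otimes A_2) = A_1 \otimes A_2 \otimes I_{\mathcal{H}_3} \otimes \cdots \otimes I_{\mathcal{H}_n} , \tag{1.6}$$

and is extended linearly.

It is obvious that in case $J \cap K = \emptyset$ and $J \cup K = \{1, \dots, n\}$, then for all $H_J \in \mathfrak{A}_J$ and $H_K \in \mathfrak{A}_K$,

$$\mathrm{Tr}\left(e^{H_J + H_K}\right) = \mathrm{Tr}_J\left(e^{H_J}\right) \mathrm{Tr}_K\left(e^{H_K}\right) , \tag{1.7}$$

but things are more interesting when $J \cap K \ne \emptyset$ and $J$ and $K$ are both proper subsets of $\{1, \dots, n\}$.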
If $H_J$ and $H_K$ do not commute, which is the typical situation for $J \cap K \ne \emptyset$, one can estimate the
left hand side of (1.7) by first applying the Golden-Thompson inequality [17, 34], which says that
for self-adjoint operators $H_J$ and $H_K$,

$$\mathrm{Tr}\left(e^{H_J + H_K}\right) \le \mathrm{Tr}\left(e^{H_J} e^{H_K}\right) .$$

One might then apply Hölder's inequality - but if $J$ and $K$ are proper subsets of $\{1, \dots, n\}$, this
will yield a finite bound if and only if all of the Hilbert spaces whose indices are not included in both
$J$ and $K$ are finite dimensional. Even then, the bound depends on the dimension in an unpleasant
way. The non-commutative B-L inequalities provided by the next theorem do not have this defect.
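Both the factorization (1.7) and the Golden-Thompson inequality can be checked on small matrices. This is a sketch assuming NumPy; the helper names `expm_h` and `random_hermitian`, the dimensions and the random seed are illustrative assumptions, and the Hermitian exponential is computed from an eigendecomposition rather than a library matrix exponential.

```python
import numpy as np

def expm_h(H):
    """exp(H) for a Hermitian matrix H, via the spectral decomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

def random_hermitian(d, rng):
    """A random d x d Hermitian matrix."""
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (X + X.conj().T) / 2

rng = np.random.default_rng(1)
d1, d2 = 2, 3
I1, I2 = np.eye(d1), np.eye(d2)

# Disjoint index sets J = {1}, K = {2}: the embedded operators commute and (1.7) holds.
HJ, HK = random_hermitian(d1, rng), random_hermitian(d2, rng)
lhs_17 = np.trace(expm_h(np.kron(HJ, I2) + np.kron(I1, HK))).real
rhs_17 = np.trace(expm_h(HJ)).real * np.trace(expm_h(HK)).real

# A generic non-commuting pair on the full space: Golden-Thompson.
A, B = random_hermitian(d1 * d2, rng), random_hermitian(d1 * d2, rng)
gt_lhs = np.trace(expm_h(A + B)).real
gt_rhs = np.trace(expm_h(A) @ expm_h(B)).real

print(np.isclose(lhs_17, rhs_17), gt_lhs <= gt_rhs)
```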

1.4 THEOREM. Let $J_1, \dots, J_N$ be $N$ non empty subsets of $\{1, \dots, n\}$. For each $i \in \{1, \dots, n\}$,
let $p(i)$ denote the number of the sets $J_1, \dots, J_N$ that contain $i$, and let $p$ denote the minimum of
the $p(i)$. Then, for self-adjoint operators $H_j$ on $\mathcal{K}_{J_j}$, $j = 1, \dots, N$,

$$\mathrm{Tr} \exp\left[\sum_{j=1}^N \phi_{J_j}(H_j)\right] \le \prod_{j=1}^N \left(\mathrm{Tr}_{J_j}\, e^{q H_j}\right)^{1/q} \tag{1.8}$$

for $q = p$ (and hence all $1 \le q \le p$), while for all $q > p$, it is possible for the left hand side to be
infinite, while the right hand side is finite.

Note that in Theorem 1.4, the constant $C$ in Definition 1.2 is 1. The fact that the constant
$C = 1$ is best possible, and that the inequality cannot hold for $q > p = \min\{p(1), \dots, p(n)\}$, is easy
to see by considering the case that each $\mathcal{H}_j$ has finite dimension $d_j$, and $H_j = 0$ for each $j$. Then

$$\mathrm{Tr} \exp\left[\sum_{j=1}^N \phi_{J_j}(H_j)\right] = \prod_{k=1}^n d_k$$

and

$$\prod_{j=1}^N \left(\mathrm{Tr}_{J_j}\, e^{q H_j}\right)^{1/q} = \prod_{j=1}^N \prod_{k \in J_j} d_k^{1/q} = \prod_{k=1}^n d_k^{p(k)/q} .$$

We will prove the inequality (1.8) for $q = p$ in Section 3.
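The counting argument with $H_j = 0$ can be spelled out in a few lines of plain Python. This is only an illustrative sketch: the function names and the sample dimensions are assumptions made here, and the index sets are the ones used in the example that follows. With $q = p$ the left side never exceeds the right side, while with $q > p$ the ratio of the two sides grows with the dimensions, so no finite constant can work.

```python
from math import prod, isclose

def multiplicities(subsets, n):
    """p(i): how many of the sets J_1, ..., J_N contain the index i (1-based)."""
    return [sum(i in J for J in subsets) for i in range(1, n + 1)]

def sides_at_zero(subsets, dims, q):
    """Both sides of (1.8) when every H_j = 0; dims[k-1] is the dimension d_k."""
    lhs = prod(dims)                                    # Tr exp(0) on the full space
    rhs = prod(d ** (pk / q)
               for d, pk in zip(dims, multiplicities(subsets, len(dims))))
    return lhs, rhs

subsets = [{1, 2, 3}, {3, 4, 5}, {5, 6, 1}]
p = min(multiplicities(subsets, 6))                     # here p = 1

for d in (2, 10, 100):
    dims = [d] * 6
    lhs, rhs_p = sides_at_zero(subsets, dims, q=p)
    _, rhs_2 = sides_at_zero(subsets, dims, q=2)
    # With q = 2 > p the ratio lhs / rhs grows like d ** 1.5.
    print(d, lhs <= rhs_p, lhs / rhs_2)
```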

As an example, consider the case in which $n = 6$, $N = 3$ and

$$J_1 = \{1,2,3\} , \qquad J_2 = \{3,4,5\} \qquad \text{and} \qquad J_3 = \{5,6,1\} .$$

Here, $p = 1$, and hence

$$\mathrm{Tr} \exp\left[\sum_{j=1}^3 \phi_{J_j}(H_j)\right] \le \prod_{j=1}^3 \mathrm{Tr}_{J_j}\left(e^{H_j}\right) . \tag{1.9}$$

The inequality (1.9) can obviously be extended to larger tensor products, and has an interesting
statistical mechanical interpretation as a bound on the partition function of a collection of interacting
spins in terms of a product of partition functions of simple constituent sub-systems.

To estimate the left side of (1.9) without using Theorem 1.4, one might use the Golden-Thompson
inequality and then Schwarz's inequality to write

$$\mathrm{Tr} \exp\left[\sum_{j=1}^3 \phi_j(H_j)\right] \le \mathrm{Tr}\left(e^{\phi_1(H_1) + \phi_3(H_3)}\, e^{\phi_2(H_2)}\right) \le \left(\mathrm{Tr}\, e^{2[\phi_1(H_1) + \phi_3(H_3)]}\right)^{1/2} \left(\mathrm{Tr}\, e^{2\phi_2(H_2)}\right)^{1/2} .$$

While the $L^2$ norms are an improvement over the $L^1$ norms in (1.9), the traces are now over the
entire tensor product space. Thus, for example,

$$\left(\mathrm{Tr}\, e^{2\phi_2(H_2)}\right)^{1/2} = (d_1 d_2 d_6)^{1/2} \left(\mathrm{Tr}_{J_2}\, e^{2H_2}\right)^{1/2} ,$$

where $d_j$ is the dimension of the Hilbert space $\mathcal{H}_j$. This dimension dependence may be unfavorable if
any of the dimensions is large.
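The inequality (1.9) and the Golden-Thompson plus Schwarz bound can be compared numerically on six two-dimensional factors. The sketch below assumes NumPy; `embed` is a hypothetical helper written for this illustration (it realizes $\phi_J$ by placing an operator on the tensor factors listed in $J$, taken in increasing order, and the identity elsewhere), and the index sets are the 0-based versions of $J_1, J_2, J_3$.

```python
import numpy as np

def expm_h(H):
    """exp(H) for a Hermitian matrix H, via the spectral decomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

def random_hermitian(d, rng):
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (X + X.conj().T) / 2

def embed(H, J, dims):
    """phi_J(H): H acts on the factors in J (sorted), the identity on the rest."""
    n = len(dims)
    J = sorted(J)
    rest = [k for k in range(n) if k not in J]
    d_rest = int(np.prod([dims[k] for k in rest]))
    full = np.kron(H, np.eye(d_rest))                  # factors ordered as J + rest
    order = J + rest
    shape = [dims[k] for k in order]
    T = full.reshape(shape + shape)                    # row axes, then column axes
    inv = np.argsort(order).tolist()                   # position of site k in `order`
    T = T.transpose(inv + [n + a for a in inv])        # restore the order 0, ..., n-1
    D = int(np.prod(dims))
    return T.reshape(D, D)

rng = np.random.default_rng(2)
dims = [2] * 6
subsets = [[0, 1, 2], [2, 3, 4], [4, 5, 0]]            # 0-based J_1, J_2, J_3
Hs = [random_hermitian(8, rng) for _ in subsets]
phi = [embed(H, J, dims) for H, J in zip(Hs, subsets)]

lhs = np.trace(expm_h(phi[0] + phi[1] + phi[2])).real
rhs_19 = np.prod([np.trace(expm_h(H)).real for H in Hs])

# Golden-Thompson followed by Schwarz: traces over the whole 64-dimensional space.
rhs_gt = (np.trace(expm_h(2 * (phi[0] + phi[2]))).real ** 0.5
          * np.trace(expm_h(2 * phi[1])).real ** 0.5)

print(lhs <= rhs_19, lhs <= rhs_gt)
```

The last factor of `rhs_gt` carries the dimension factor $(d_1 d_2 d_6)^{1/2}$ discussed above, which is what the bound (1.9) avoids.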

1.4 A generalized Young's inequality in Clifford algebras 

Our next example concerns Clifford algebras, which as Segal emphasized [31], allow one to represent
Fermion Fock space as an $L^2$ space - albeit a non-commutative $L^2$ space, but still with many of the
advantages of having a Hilbert space represented as a function space, as in the usual Schrödinger
representation in quantum mechanics.

In the finite dimensional setting, with $n$ degrees of freedom, one starts with $n$ operators
$Q_1, \dots, Q_n$ on some Hilbert space $\mathcal{H}$ that satisfy the canonical anticommutation relations

$$Q_i Q_j + Q_j Q_i = 2\delta_{i,j} I .$$

One can concretely construct such operators acting on $\mathcal{H} = (\mathbb{C}^2)^{\otimes n}$, the $n$-fold tensor product of
$\mathbb{C}^2$ with itself; see [6]. The Clifford algebra $\mathfrak{C}$ is the operator algebra on $\mathcal{H}$ that is generated by
$Q_1, \dots, Q_n$.

The Clifford algebra $\mathfrak{C}$ itself is $2^n$ dimensional. In fact, let $\alpha = (\alpha_1, \dots, \alpha_n)$ be a Fermionic
multi-index, which means that each $\alpha_j$ is either 0 or 1. Then define

$$Q^\alpha = Q_1^{\alpha_1} Q_2^{\alpha_2} \cdots Q_n^{\alpha_n} . \tag{1.10}$$

It is easy to see that the $2^n$ operators $Q^\alpha$ are a basis for the Clifford algebra, so that any operator
$A$ in $\mathfrak{C}$ has a unique expression

$$A = \sum_{\alpha} x_\alpha Q^\alpha .$$

The linear functional $\tau$ on $\mathfrak{C}$ is defined by

$$\tau(A) = x_{(0,\dots,0)} . \tag{1.11}$$

That is, $\tau$ acting on $A$ picks off the coefficient of the identity in $A = \sum_\alpha x_\alpha Q^\alpha$. It turns out that
when the Clifford algebra is constructed in the way described here, as an algebra of operators on the
$2^n$ dimensional space $\mathcal{H}$, $\tau$ is nothing other than the normalized trace:

$$\tau(A) = 2^{-n}\, \mathrm{Tr}_{\mathcal{H}}(A) .$$

Hence $\tau$ is a positive linear functional, and $((\mathbb{C}^2)^{\otimes n}, \mathfrak{C}, \tau)$ is a non commutative integration space
in the sense of Segal.
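One standard way to realize such operators on $(\mathbb{C}^2)^{\otimes n}$ is the Jordan-Wigner construction from Pauli matrices; whether this is exactly the construction in the cited reference is an assumption made here. The sketch below, assuming NumPy and with illustrative function names, builds the generators for $n = 3$, checks the anticommutation relations, and checks that the normalized trace of $Q^\alpha$ is 1 for $\alpha = (0,\dots,0)$ and 0 otherwise, so that it picks off the coefficient of the identity.

```python
import numpy as np
from functools import reduce
from itertools import product

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def clifford_generators(n):
    """Q_j = Z x ... x Z x X x I x ... x I, with j - 1 leading factors of Z."""
    return [reduce(np.kron, [SZ] * j + [SX] + [I2] * (n - j - 1)) for j in range(n)]

def tau(A):
    """Normalized trace."""
    return np.trace(A) / A.shape[0]

def q_alpha(Q, alpha):
    """Q^alpha = Q_1^{alpha_1} ... Q_n^{alpha_n} for a Fermionic multi-index alpha."""
    out = np.eye(Q[0].shape[0])
    for Qj, aj in zip(Q, alpha):
        if aj:
            out = out @ Qj
    return out

n = 3
Q = clifford_generators(n)
dim = 2 ** n

# Canonical anticommutation relations Q_i Q_j + Q_j Q_i = 2 delta_ij I.
car_ok = all(np.allclose(Q[i] @ Q[j] + Q[j] @ Q[i], 2.0 * (i == j) * np.eye(dim))
             for i in range(n) for j in range(n))

# tau(Q^alpha) for all 2^n multi-indices.
taus = {alpha: tau(q_alpha(Q, alpha)) for alpha in product((0, 1), repeat=n)}

print(car_ok, taus[(0, 0, 0)], max(abs(v) for a, v in taus.items() if any(a)))
```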

Clifford algebras have infinitely many subalgebras that are also Clifford algebras of lower
dimension. This is in contrast to the setting described in Example 1.3, in which the only natural
subalgebras are the $2^n - 1$ subalgebras corresponding to the $2^n - 1$ non empty subsets of the index
set $\{1, \dots, n\}$.

To describe these subalgebras, let $J$ be the canonical injection of $\mathbb{R}^n$ into $\mathfrak{C}$, which is given by

$$J((x_1, \dots, x_n)) = \sum_{j=1}^n x_j Q_j . \tag{1.12}$$

If $x$ and $y$ are any two vectors in $\mathbb{R}^n$, it is easy to see from the canonical anticommutation relations
that

$$J(x)J(y) + J(y)J(x) = 2(x \cdot y) I .$$

Hence if $V$ is any $m$ dimensional subspace of $\mathbb{R}^n$, and $\{u_1, \dots, u_m\}$ is any orthonormal basis for $V$,
the $m$ operators

$$J(u_1), \dots, J(u_m)$$

again satisfy the canonical anticommutation relations, and generate a subalgebra of $\mathfrak{C}$ that is
denoted by $\mathfrak{C}(V)$, and referred to as the Clifford algebra over $V$. In the same vein, it is convenient
to refer to $\mathfrak{C}$ itself as the Clifford algebra over $\mathbb{R}^n$. Obviously, $\mathfrak{C}(V)$ is naturally isomorphic to
$\mathfrak{C}(\mathbb{R}^m)$, and for $A \in \mathfrak{C}(V)$ one may compute $\tau(A)$ using either the normalized trace $\tau$ inherited
from $\mathfrak{C}$, or the normalized trace $\tau_V$ induced by the identification of $\mathfrak{C}(V)$ with $\mathfrak{C}(\mathbb{R}^m)$.

As Segal emphasized, $((\mathbb{C}^2)^{\otimes n}, \mathfrak{C}, \tau)$ is in many ways a non-commutative analog of the Gaussian
measure space $(\mathbb{R}^n, \gamma(x)dx)$ where

$$\gamma(x) = \frac{1}{(2\pi)^{n/2}}\, e^{-|x|^2/2} . \tag{1.13}$$

In fact, just as orthogonality implies independence in $(\mathbb{R}^n, \gamma(x)dx)$, if $V$ and $W$ are two orthogonal
subspaces of $\mathbb{R}^n$, and if $A \in \mathfrak{C}(V)$ and $B \in \mathfrak{C}(W)$, then

$$\tau(AB) = \tau(A)\tau(B) .$$
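The anticommutator identity for $J(x)$, $J(y)$ and the independence property $\tau(AB) = \tau(A)\tau(B)$ can be checked on a small example. The sketch assumes NumPy and the Jordan-Wigner generators as one possible concrete realization; the random orthonormal basis, the subspaces $V = \mathrm{span}\{u_1, u_2\}$, $W = \mathrm{span}\{w_1\}$ and all function names are illustrative assumptions.

```python
import numpy as np
from functools import reduce

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def clifford_generators(n):
    """Jordan-Wigner generators Q_1, ..., Q_n on (C^2)^{tensor n}."""
    return [reduce(np.kron, [SZ] * j + [SX] + [I2] * (n - j - 1)) for j in range(n)]

def tau(A):
    """Normalized trace."""
    return np.trace(A) / A.shape[0]

def J(x, Q):
    """Canonical injection (1.12): J(x) = sum_j x_j Q_j."""
    return sum(xj * Qj for xj, Qj in zip(x, Q))

n = 3
Q = clifford_generators(n)
Id = np.eye(2 ** n)
rng = np.random.default_rng(3)

# J(x) J(y) + J(y) J(x) = 2 (x . y) I for arbitrary vectors x, y.
x, y = rng.normal(size=n), rng.normal(size=n)
anticomm = J(x, Q) @ J(y, Q) + J(y, Q) @ J(x, Q)
anticomm_ok = np.allclose(anticomm, 2.0 * np.dot(x, y) * Id)

# Orthogonal subspaces V = span{u1, u2} and W = span{w1} from a random orthonormal basis.
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
u1, u2, w1 = U[:, 0], U[:, 1], U[:, 2]

# General elements of C(V) and C(W), with random real coefficients.
a, b = rng.normal(size=4), rng.normal(size=2)
A = a[0] * Id + a[1] * J(u1, Q) + a[2] * J(u2, Q) + a[3] * J(u1, Q) @ J(u2, Q)
B = b[0] * Id + b[1] * J(w1, Q)

print(anticomm_ok, np.isclose(tau(A @ B), tau(A) * tau(B)))
```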

The results we prove here reinforce this analogy. We are now ready to introduce our next example:

1.5 EXAMPLE. For some $n > 1$, let $\mathfrak{A}$ be the Clifford algebra over $\mathbb{R}^n$ with its usual inner
product, and let $\mathfrak{A}$ be equipped with its unique tracial state $\tau$, which is simply the normalized
trace.

For each $j = 1, \dots, N$, let $V_j$ be a subspace of $\mathbb{R}^n$, and let $\mathfrak{A}_j$ be $\mathfrak{C}(V_j)$, the Clifford algebra over
$V_j$ with the inner product $V_j$ inherits from $\mathbb{R}^n$. Let $\mathfrak{A}_j$ be equipped with its unique tracial state $\tau_j$.
The natural embedding of $V_j$ into $\mathbb{R}^n$ induces a homomorphism of $\mathfrak{A}_j$ into $\mathfrak{A}$, and we define this to
be $\phi_j$. In this setting, we shall prove

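1.6 THEOREM. Let $V_1, \dots, V_N$ be $N$ subspaces of $\mathbb{R}^n$, and let $\mathfrak{A}_j$ be the Clifford algebra over $V_j$
with the inner product $V_j$ inherits from $\mathbb{R}^n$, and let $\mathfrak{A}_j$ be equipped with its unique tracial state $\tau_j$.
Let $\phi_j$ be the natural homomorphism of $\mathfrak{A}_j$ into $\mathfrak{A}$ induced by the natural embedding of $V_j$ into $\mathbb{R}^n$.
Then

$$\tau\left(\exp\left[\sum_{j=1}^N \phi_j(H_j)\right]\right) \le \prod_{j=1}^N \left(\tau_j \exp\left[p_j H_j\right]\right)^{1/p_j} \tag{1.14}$$

for all self-adjoint operators $H_j \in \mathfrak{A}_j$ if and only if

$$\sum_{j=1}^N \frac{1}{p_j} P_j \le \mathrm{Id}_{\mathbb{R}^n} , \tag{1.15}$$

where $P_j$ is the orthogonal projection onto $V_j$ in $\mathbb{R}^n$.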

In the special case in which $\dim(V_j) = 1$ for each $j$, (1.14) reduces to an interesting inequality
for the hyperbolic cosine. Indeed, let $u_j$ be one of the two unit vectors in $V_j$.

Then, with $u_j \otimes u_j$ denoting the orthogonal projection onto the span of $u_j$, (1.15) reduces to

$$\sum_{j=1}^N \frac{1}{p_j}\, u_j \otimes u_j \le \mathrm{Id}_{\mathbb{R}^n} . \tag{1.16}$$

The greater simplification, however, is that in this case, the space of self-adjoint operators in each
$\mathfrak{A}_j$ is just two dimensional, and with $J$ denoting the canonical injection defined in (1.12),

$$H_j = a_j I + b_j J(u_j)$$

for some real numbers $a_j$ and $b_j$. Then

$$\sum_{j=1}^N H_j = \left(\sum_{j=1}^N a_j\right) I + J\left(\sum_{j=1}^N b_j u_j\right) .$$

This operator has exactly two eigenvalues,

$$\sum_{j=1}^N a_j \pm \left|\sum_{j=1}^N b_j u_j\right| ,$$

with equal multiplicities.

Likewise, $p_j H_j$ has exactly two eigenvalues $p_j a_j \pm p_j b_j$ with equal multiplicities. Hence, in this
case, (1.14) reduces to

$$\cosh\left(\left|\sum_{j=1}^N b_j u_j\right|\right) \le \prod_{j=1}^N \left(\cosh(p_j b_j)\right)^{1/p_j} \qquad \text{for all } (b_1, \dots, b_N) \in \mathbb{R}^N , \tag{1.17}$$

which, according to the theorem, must hold whenever (1.16) is satisfied. (The $a_j$'s make the same
contribution to both sides, and may be cancelled away.) Taking the logarithm of both sides, this
can be rewritten as

$$\ln \cosh\left(\left|\sum_{j=1}^N b_j u_j\right|\right) \le \sum_{j=1}^N \frac{1}{p_j} \ln \cosh(p_j b_j) \qquad \text{for all } (b_1, \dots, b_N) \in \mathbb{R}^N , \tag{1.18}$$

and this inequality must hold whenever the unit vectors $\{u_1, \dots, u_N\}$ and the positive numbers
$\{p_1, \dots, p_N\}$ are such that (1.16) is satisfied.
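A quick numerical experiment illustrates (1.18) in plain Python. The configuration below is an assumption chosen for illustration: three unit vectors in $\mathbb{R}^2$ at mutual angles of 120 degrees, for which $\sum_j u_j \otimes u_j = \tfrac{3}{2}\,\mathrm{Id}$, so that $p_j = 3/2$ gives equality in (1.16). The second configuration violates (1.16) and shows that the inequality can then fail. The function names are hypothetical.

```python
import math
import random

def lhs_118(b, u):
    """ln cosh |sum_j b_j u_j| for coefficients b and unit vectors u in R^n."""
    n = len(u[0])
    v = [sum(bj * uj[k] for bj, uj in zip(b, u)) for k in range(n)]
    return math.log(math.cosh(math.sqrt(sum(c * c for c in v))))

def rhs_118(b, p):
    """sum_j (1/p_j) ln cosh(p_j b_j)."""
    return sum(math.log(math.cosh(pj * bj)) / pj for bj, pj in zip(b, p))

# Three unit vectors at 120 degrees: sum_j (2/3) u_j (x) u_j = Id on R^2.
frame = [(math.cos(2 * math.pi * j / 3), math.sin(2 * math.pi * j / 3)) for j in range(3)]
p_frame = (1.5, 1.5, 1.5)

rng = random.Random(0)
samples = [[rng.uniform(-3.0, 3.0) for _ in range(3)] for _ in range(1000)]
all_ok = all(lhs_118(b, frame) <= rhs_118(b, p_frame) + 1e-12 for b in samples)

# Condition (1.16) violated: u_1 = u_2 and p_1 = p_2 = 1 give 2 u (x) u, which is not <= Id.
bad_u = [(1.0, 0.0), (1.0, 0.0)]
bad_p = (1.0, 1.0)
violated = lhs_118((1.0, 1.0), bad_u) > rhs_118((1.0, 1.0), bad_p)

print(all_ok, violated)
```

In the violating configuration the failure is just the elementary fact that $\cosh(b_1 + b_2) > \cosh(b_1)\cosh(b_2)$ when $b_1, b_2 > 0$.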

Later on, we shall give an elementary proof of this inequality, and hence an elementary proof of
Theorem 1.6 when each $V_j$ is one dimensional. Our proof of the other cases is less than elementary,
and even our elementary proof of (1.18) is less than direct.

2 Subadditivity of Entropy and Generalized Young's Inequalities

In the examples we have introduced in the previous section, the positive linear functionals $\lambda$ under
consideration are either traces or normalized traces. Throughout this section, we assume that our
non commutative integration spaces $(\mathcal{H}, \mathfrak{A}, \lambda)$ are based on tracial positive linear functionals $\lambda$.
That is, we require that for all $A, B \in \mathfrak{A}$,

$$\lambda(AB) = \lambda(BA) .$$

In such a non commutative integration space $(\mathcal{H}, \mathfrak{A}, \lambda)$, a probability density is a non negative
element $\rho$ of $\mathfrak{A}$ such that $\lambda(\rho) = 1$. Indeed, the tracial property of $\lambda$ ensures that

$$\lambda(\rho A) = \lambda(A\rho) = \lambda(\rho^{1/2} A \rho^{1/2})$$

so that $A \mapsto \lambda(\rho A)$ is a positive linear functional that is 1 on the identity.

Now suppose we have $N$ non-commutative integration spaces $(\mathcal{H}_j, \mathfrak{A}_j, \lambda_j)$ and W* homomorphisms
$\phi_j : \mathfrak{A}_j \to \mathfrak{A}$. Then these homomorphisms induce maps from the space of probability
densities on $\mathfrak{A}$ to the spaces of probability densities on the $\mathfrak{A}_j$:

For any probability density $\rho$ on $(\mathfrak{A}, \lambda)$, let $\rho_j$ be the probability density on $(\mathfrak{A}_j, \lambda_j)$ defined by

$$\lambda_j(\rho_j A) = \lambda(\rho\, \phi_j(A))$$

for all $A \in \mathfrak{A}_j$.

For example, in the setting of Example 1.3, $\rho_{J_j}$ is just the partial trace of $\rho$ over $\bigotimes_{k \in J_j^c} \mathcal{H}_k$,
leaving an operator on $\bigotimes_{k \in J_j} \mathcal{H}_k$. In the Clifford algebra setting of Example 1.5, $\rho_j$ is simply the
orthogonal projection of $\rho$ in $L^2(\mathfrak{C}, \tau)$ onto $\mathfrak{C}(V_j)$, which is also known as the conditional expectation
[36] of $\rho$ given $\mathfrak{C}(V_j)$.

In this section, we are concerned with the relations between the entropies of $\rho$ and the $\rho_1, \dots, \rho_N$.
The entropy of a probability density $\rho$, $S(\rho)$, is defined by

$$S(\rho) = -\lambda(\rho \ln \rho) .$$

Evidently, the entropy functional is concave on the set of probability densities.

2.1 DEFINITION. Given tracial non-commutative integration spaces $(\mathcal{H}, \mathfrak{A}, \lambda)$ and $(\mathcal{H}_j, \mathfrak{A}_j, \lambda_j)$,
$j = 1, \dots, N$, together with W* algebra homomorphisms $\phi_j : \mathfrak{A}_j \to \mathfrak{A}$, $j = 1, \dots, N$, and numbers
$1 \le p_j \le \infty$, $j = 1, \dots, N$, a generalized subadditivity of entropy inequality holds if there is a finite
constant $C$ so that

$$\sum_{j=1}^N \frac{1}{p_j} S(\rho_j) \ge S(\rho) - \ln C \tag{2.1}$$

for all probability densities $\rho$ in $\mathfrak{A}$.

It turns out that for tracial non-commutative integration spaces, generalized subadditivity of
entropy inequalities and B-L inequalities are dual to one another, just as they are in the commutative
case [10], so that if one holds, so does the other, with the same values of $p_1, \dots, p_N$ and $C$. The
following is in fact a direct non-commutative analog of the main theorem of [10].

2.2 THEOREM. Let $(\mathcal{H}, \mathfrak{A}, \lambda)$ and $(\mathcal{H}_j, \mathfrak{A}_j, \lambda_j)$, $j = 1, \dots, N$, be tracial non-commutative integration
spaces. Let $\phi_j : \mathfrak{A}_j \to \mathfrak{A}$, $j = 1, \dots, N$ be W* algebra homomorphisms. Then for any numbers
$1 \le p_j \le \infty$, $j = 1,\dots,N$, and any finite constant $C$, the generalized subadditivity of entropy
inequality (2.1) is true for all probability densities $\rho$ on $\mathfrak{A}$ if and only if the non-commutative B-L
inequality (1.5) is true for all self-adjoint $H_j \in \mathfrak{A}_j$, $j = 1, \dots, N$, with the same $p_1, \dots, p_N$ and the
same $C$.

As a consequence of Theorem 2.2, one strategy for proving a non-commutative B-L inequality
is to prove the corresponding generalized subadditivity of entropy inequality. We shall see in our
examples that this is an effective strategy; indeed, this is how we prove Theorems 1.4 and 1.6.

In the current tracial context, the proof of Theorem 2.2 is a direct adaptation of the proof of
the corresponding result in the context of Lebesgue integration given in [10]. It turns on a well-known
formula for the Legendre transform of the entropy. For completeness, we give this formula
in Lemma 2.3 below. Before stating the lemma, it is convenient to extend the definition of $S$ to all
of $\mathfrak{A}_{sa}$, the subspace of self-adjoint elements of $\mathfrak{A}$, as follows:

$$S(A) = \begin{cases} -\lambda(A \ln A) & \text{if } A \ge 0 \text{ and } \lambda(A) = 1, \\ -\infty & \text{otherwise.} \end{cases} \tag{2.2}$$

2.3 LEMMA. Let $\mathfrak{A}$ be $\mathcal{B}(\mathcal{H})$, the algebra of bounded operators on a separable Hilbert space $\mathcal{H}$.
Let $\lambda$ denote either the trace Tr on $\mathcal{H}$, or, if $\mathcal{H}$ is finite dimensional, the normalized trace $\tau$. Then
for all $A \in \mathfrak{A}_{sa}$,

$$-S(A) = \sup_{H \in \mathfrak{A}_{sa}} \left\{ \lambda(AH) - \ln\left(\lambda\left(e^H\right)\right) \right\} . \tag{2.3}$$

The supremum is an attained maximum if and only if $A$ is a strictly positive probability density, in
which case it is attained at $H$ if and only if $H = \ln A + cI$ for some $c \in \mathbb{R}$. Consequently, for all
$H \in \mathfrak{A}_{sa}$,

$$\ln\left(\lambda\left(e^H\right)\right) = \sup_{A \in \mathfrak{A}_{sa}} \left\{ \lambda(AH) + S(A) \right\} . \tag{2.4}$$

The supremum is a maximum at all points of the domain of $\ln\left(\lambda\left(e^H\right)\right)$, in which case it is attained
only at the single point $A = e^H / \lambda(e^H)$.

Proof: We consider first the case that $\lambda = \mathrm{Tr}$, and $\mathcal{H}$ has finite dimension $d$. To see that the
supremum is $\infty$ unless $0 \le A \le I$, let $c$ be any real number, and let $u$ be any unit vector. Then let
$H$ be $c$ times the orthogonal projection onto $u$. For this choice of $H$,

$$\lambda(AH) - \ln\left(\lambda\left(e^H\right)\right) = c\langle u, Au\rangle - \ln(e^c + (d-1)) .$$

If $\langle u, Au\rangle < 0$, this tends to $\infty$ as $c$ tends to $-\infty$. If $\langle u, Au\rangle > 1$, this tends to $\infty$ as $c$ tends to $\infty$.
Hence we need only consider $0 \le A \le I$. Next, taking $H = cI$, $c \in \mathbb{R}$,

$$\lambda(AH) - \ln\left(\lambda\left(e^H\right)\right) = c\lambda(A) - c - \ln(d) .$$

Unless $\lambda(A) = 1$, this tends to $\infty$ as $c$ tends to $\infty$ (if $\lambda(A) > 1$) or to $-\infty$ (if $\lambda(A) < 1$). Hence we
need only consider the case that $A$ is a density matrix $\rho$.

Let $\rho$ be any density matrix on $\mathcal{H}$ and let $H$ be any self-adjoint operator such that $\mathrm{Tr}(e^H) < \infty$.
Then define the density matrix $\sigma$ by

$$\sigma = \frac{e^H}{\mathrm{Tr}(e^H)} .$$

Then, by the positivity of the relative entropy,

$$\mathrm{Tr}(\rho \ln \rho - \rho \ln \sigma) \ge 0$$

with equality if and only if $\sigma = \rho$. But by the definition of $\sigma$, this reduces to

$$\mathrm{Tr}(\rho \ln \rho) \ge \mathrm{Tr}(\rho H) - \ln\left(\mathrm{Tr}\left(e^H\right)\right) ,$$

with equality if and only if $H = \ln \rho + cI$ for some $c \in \mathbb{R}$. From here, the rest is very simple, including
the treatment of the normalized trace. ■
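The variational formula (2.4) with $\lambda = \mathrm{Tr}$ is easy to probe numerically in finite dimension. The following sketch assumes NumPy; the dimension, the number of trial densities and all function names are illustrative assumptions. It checks that $\mathrm{Tr}(\rho H) + S(\rho)$ never exceeds $\ln \mathrm{Tr}(e^H)$ on random density matrices, and that the value is attained at $\rho = e^H/\mathrm{Tr}(e^H)$.

```python
import numpy as np

def expm_h(H):
    """exp(H) for a Hermitian matrix H, via the spectral decomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

def random_hermitian(d, rng):
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (X + X.conj().T) / 2

def random_density(d, rng):
    """A random full-rank density matrix."""
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = X @ X.conj().T
    return rho / np.trace(rho).real

def entropy(rho):
    """S(rho) = -Tr(rho ln rho), computed from the eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-np.sum(w * np.log(w)))

def functional(rho, H):
    """Tr(rho H) + S(rho), the quantity maximized in (2.4)."""
    return float(np.trace(rho @ H).real) + entropy(rho)

rng = np.random.default_rng(4)
d = 4
H = random_hermitian(d, rng)
Z = np.trace(expm_h(H)).real
log_partition = float(np.log(Z))
gibbs = expm_h(H) / Z                                  # the maximizer e^H / Tr(e^H)

trial_values = [functional(random_density(d, rng), H) for _ in range(200)]

print(max(trial_values) <= log_partition, abs(functional(gibbs, H) - log_partition))
```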

Petz [26] has shown how to extend Lemma 2.3 to a much more general context, and his result
can be used to extend the validity of Theorem 2.2 beyond the tracial case. However, since the
examples in which we prove the generalized subadditivity inequality here are tracial, we shall not
go into this.

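Proof of Theorem 2.2: Suppose first that the non-commutative B-L inequality (1.5) holds. Then,
for any probability density $\rho$ in $\mathfrak{A}$, and any self-adjoint $H_j \in \mathfrak{A}_j$, $j = 1, \dots, N$, apply (2.3) with
$A = \rho$ and $H = \sum_{j=1}^N \phi_j(H_j)$ to obtain

$$\begin{aligned}
-S(\rho) &\ge \lambda\left(\rho \left[\sum_{j=1}^N \phi_j(H_j)\right]\right) - \ln\left(\lambda\left(\exp\left[\sum_{j=1}^N \phi_j(H_j)\right]\right)\right) \\
&= \sum_{j=1}^N \lambda_j(\rho_j H_j) - \ln\left(\lambda\left(\exp\left[\sum_{j=1}^N \phi_j(H_j)\right]\right)\right) \\
&\ge \sum_{j=1}^N \lambda_j(\rho_j H_j) - \ln\left(C \prod_{j=1}^N \left(\lambda_j\, e^{p_j H_j}\right)^{1/p_j}\right) \\
&= \sum_{j=1}^N \frac{1}{p_j}\left[\lambda_j(\rho_j [p_j H_j]) - \ln\left(\lambda_j\left(e^{p_j H_j}\right)\right)\right] - \ln C .
\end{aligned} \tag{2.5}$$

The first inequality here is (2.3), and the second is the non-commutative B-L inequality (1.5).

Now choosing $p_j H_j$ to maximize $\lambda_j(\rho_j [p_j H_j]) - \ln\left(\lambda_j\left(e^{p_j H_j}\right)\right)$, we get from (2.3) once more
that

$$\lambda_j(\rho_j [p_j H_j]) - \ln\left(\lambda_j\left(e^{p_j H_j}\right)\right) = -S(\rho_j) = \lambda_j(\rho_j \ln \rho_j) .$$

Thus, we have proved (2.1) with the same $p_1, \dots, p_N$ and $C$ that we had in (1.5).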

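Next, suppose that (2.1) is true. We shall show that in this case, the non-commutative B-L
inequality (1.5) holds with the same $p_1, \dots, p_N$ and $C$. To do this, let the self-adjoint operators
$H_1, \dots, H_N$ be given, and define

$$\rho = \left[\lambda\left(\exp\left[\sum_{j=1}^N \phi_j(H_j)\right]\right)\right]^{-1} \exp\left[\sum_{j=1}^N \phi_j(H_j)\right] .$$

Then by Lemma 2.3,

$$\begin{aligned}
\ln\left(\lambda\left(\exp\left[\sum_{j=1}^N \phi_j(H_j)\right]\right)\right) &= \lambda\left(\rho \left[\sum_{j=1}^N \phi_j(H_j)\right]\right) + S(\rho) \\
&= \sum_{j=1}^N \lambda_j(\rho_j H_j) + S(\rho) \\
&\le \sum_{j=1}^N \frac{1}{p_j}\left[\lambda_j(\rho_j [p_j H_j]) + S(\rho_j)\right] + \ln C \\
&\le \sum_{j=1}^N \frac{1}{p_j} \ln\left[\lambda_j\left(\exp(p_j H_j)\right)\right] + \ln C .
\end{aligned} \tag{2.6}$$

The first inequality is the generalized subadditivity of entropy inequality (2.1), and the second is
(2.4). Exponentiating both sides of (2.6), we obtain (1.5) with the same $p_1, \dots, p_N$ and $C$ that we had
in (2.1). ■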



3 Proof of the generalized subadditivity of entropy inequality for 
tensor products of Hilbert spaces 



The crucial tool that we use in this section is the strong subadditivity of the entropy [24], which we
now recall in a formulation that is suited to our purposes.

Suppose, as in Example 1.3, that we are given $n$ separable Hilbert spaces $\mathcal{H}_1, \dots, \mathcal{H}_n$. As before,
let $\mathcal{K}$ denote their tensor product, and for any non empty subset $J$ of $\{1, \dots, n\}$, let $\mathcal{K}_J$ denote
$\bigotimes_{j \in J} \mathcal{H}_j$.

For a density matrix $\rho$ on $\mathcal{K}$, and any non empty subset $J$ of $\{1, \dots, n\}$, define $\rho_J = \mathrm{Tr}_{J^c}\, \rho$ to be
the density matrix on $\mathcal{K}_J$ induced by the natural injection of $\mathcal{B}(\mathcal{K}_J)$ into $\mathcal{B}(\mathcal{K})$. As noted above,
$\rho_J$ is nothing other than the partial trace of $\rho$ over the complementary product of Hilbert spaces,
$\bigotimes_{j \notin J} \mathcal{H}_j$.

The strong subadditivity of entropy is expressed by the inequality stating that for all nonempty
$J, K \subset \{1, \dots, n\}$,

$$S(\rho_J) + S(\rho_K) \ge S(\rho_{J \cup K}) + S(\rho_{J \cap K}) . \tag{3.1}$$

In case $J \cap K = \emptyset$, it reduces to the ordinary subadditivity of the entropy, which is the elementary
inequality

$$S(\rho_J) + S(\rho_K) \ge S(\rho_{J \cup K}) \qquad \text{for } J \cap K = \emptyset . \tag{3.2}$$

Combining these, we have

$$\begin{aligned}
S(\rho_{\{1,2\}}) + S(\rho_{\{2,3\}}) + S(\rho_{\{3,1\}}) &\ge S(\rho_{\{1,2,3\}}) + S(\rho_{\{2\}}) + S(\rho_{\{1,3\}}) \\
&\ge 2 S(\rho_{\{1,2,3\}}) ,
\end{aligned} \tag{3.3}$$

where the first inequality is the strong subadditivity (3.1) and the second is the ordinary subadditivity
(3.2). Thus, for $n = 3$ and $J_1 = \{1, 2\}$, $J_2 = \{2, 3\}$ and $J_3 = \{3, 1\}$, we obtain

$$\frac{1}{2} \sum_{j=1}^3 S(\rho_{J_j}) \ge S(\rho) . \tag{3.4}$$

In this example, each index $i \in \{1, 2, 3\}$ belonged to exactly two of the sets $J_1$, $J_2$ and $J_3$, and this
is the source of the factor of 1/2 in the inequality. The same procedure leads to the following result:

3.1 THEOREM. Let $J_1, \dots, J_N$ be $N$ non empty subsets of $\{1, \dots, n\}$. For each $i \in \{1, \dots, n\}$,
let $p(i)$ denote the number of the sets $J_1, \dots, J_N$ that contain $i$, and let $p$ denote the minimum of
the $p(i)$. Then

$$\frac{1}{p} \sum_{j=1}^N S(\rho_{J_j}) \ge S(\rho)$$

for all density matrices $\rho$ on $\mathcal{K} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$.

Proof: Simply use strong subadditivity to combine overlapping sets to produce as many "complete"
sets as possible, as in the example above. Clearly, there can be no more than $p$ of these. If $p(i) > p$
for some indices $i$, there will be "left over" partial sets. The entropy is always non negative, and
therefore, discarding the corresponding entropies gives us $\sum_{j=1}^N S(\rho_{J_j}) \ge p S(\rho)$, and hence the
inequality. ■
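The three-factor example behind this proof can be tested numerically. The sketch below assumes NumPy; `partial_trace` is a hypothetical helper written for this illustration, the local dimensions $(2, 3, 2)$ are arbitrary, and sites are indexed from 0. It checks one instance of strong subadditivity (3.1) and the inequality $\tfrac12\sum_j S(\rho_{J_j}) \ge S(\rho)$ for the three pairs of sites on a random density matrix.

```python
import numpy as np

def entropy(rho):
    """S(rho) = -Tr(rho ln rho), computed from the eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-np.sum(w * np.log(w)))

def partial_trace(rho, keep, dims):
    """Reduced density matrix on the sites in `keep` (0-based, taken in increasing order)."""
    n = len(dims)
    keep = sorted(keep)
    T = rho.reshape(list(dims) + list(dims))           # row axes, then column axes
    # Trace out unwanted sites, highest index first so lower axis numbers stay valid.
    for k in sorted((s for s in range(n) if s not in keep), reverse=True):
        m = T.ndim // 2                                # number of sites still present
        T = np.trace(T, axis1=k, axis2=k + m)
    d = int(np.prod([dims[k] for k in keep]))
    return T.reshape(d, d)

def random_density(d, rng):
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = X @ X.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(5)
dims = [2, 3, 2]
rho = random_density(12, rng)

def S(keep):
    return entropy(partial_trace(rho, keep, dims))

# Strong subadditivity (3.1) with J = {1,2}, K = {2,3} (0-based: [0,1] and [1,2]).
ssa_gap = S([0, 1]) + S([1, 2]) - S([0, 1, 2]) - S([1])

# Each site lies in exactly two of the three pairs, hence the factor 1/2.
pairs = [[0, 1], [1, 2], [2, 0]]
half_sum = 0.5 * sum(S(J) for J in pairs)

print(ssa_gap >= 0, half_sum >= entropy(rho))
```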

It is now a very simple matter to prove Theorem 1.4.
Proof of Theorem 1.4: By the remarks made after the statement of the theorem, all that remains
to be proved is the inequality (1.8) for $q = p$. However, this follows directly from Theorem 2.2 and
Theorem 3.1. ■

4 On the generalized Young's inequality with a Gaussian reference measure



Before turning to the proof of our non-commutative B-L inequality in Clifford algebras, we discuss
the commutative case in which the reference measure is Gaussian. We do this here for two reasons:
First, as noted, a Clifford algebra $\mathfrak{C}$ with its normalized trace $\tau$ is a non commutative analog of
a Gaussian measure space. This analogy is strong enough that we shall be able to pattern our
analysis in the Clifford algebra case on an analysis of the Gaussian case.

Second, the Gaussian inequality is of interest in itself, and seems not to have been fully studied
before. Suppose that $V_1, \dots, V_N$ are $N$ non zero subspaces of $\mathbb{R}^n$, and for each $j$, define $\phi_j = P_j$ to
be the orthogonal projection of $\mathbb{R}^n$ onto $V_j$. Equip $\mathbb{R}^n$ and each $V_j$ with Lebesgue measure.
Then the problem of determining for which sets of indices $\{p_1, \dots, p_N\}$ there exists a finite constant
$C$ so that (1.3) holds for all non-negative measurable functions $f_j$ on $V_j$, $j = 1, \dots, N$ is highly
non trivial, and has only recently been fully solved [7, 8]. Moreover, determining the value of the
best constant $C$ for those choices of $\{p_1, \dots, p_N\}$ is still a challenging finite dimensional variational
problem for which there is no general explicit solution.

In contrast, suppose we are given a non-degenerate Gaussian measure on $\mathbb{R}^n$. It will be convenient
to take the covariance matrix of the Gaussian to define the inner product, so that the
Gaussian becomes a unit Gaussian. For each positive integer $m$, define $\gamma_m(x) = (2\pi)^{-m/2} e^{-|x|^2/2}$
on $\mathbb{R}^m$. Then equipping $\mathbb{R}^n$ with the measure $\gamma_n(x)dx$ and equipping each $V_j$ with the measure $\gamma_{d_j}(x)dx$,
$d_j$ being the dimension of $V_j$, it turns out that there is a very simple necessary and sufficient condition
on the indices $\{p_1, \dots, p_N\}$ for the constant $C$ to be finite, and better yet, the best constant $C$ is
always 1 whenever it is finite:

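4.1 THEOREM. Let $V_1, \dots, V_N$ be $N$ non zero subspaces of $\mathbb{R}^n$, and for each $j$, let $d_j$ denote
the dimension of $V_j$. Define $P_j$ to be the orthogonal projection of $\mathbb{R}^n$ onto $V_j$. Given the numbers
$p_j$, $1 \le p_j \le \infty$ for $j = 1, \dots, N$, there exists a finite constant $C$ such that

$$\int_{\mathbb{R}^n} \prod_{j=1}^N f_j \circ P_j\, \gamma_n(x)\,dx \le C \prod_{j=1}^N \left(\int_{V_j} f_j^{p_j}(y)\, \gamma_{d_j}(y)\,dy\right)^{1/p_j} \tag{4.1}$$

holds for all non-negative $f_j$ on $V_j$, $j = 1, \dots, N$, if and only if

$$\sum_{j=1}^N \frac{1}{p_j} P_j \le \mathrm{Id}_{\mathbb{R}^n} , \tag{4.2}$$

and in this case, $C = 1$.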



We hasten to point out that this theorem is partially known. In the special case that each of
the subspaces $V_j$ is one dimensional, Barthe and Cordero-Erausquin [2] have proved the sufficiency
of the condition (4.2), which reduces to

$$\sum_{j=1}^N \frac{1}{p_j}\, u_j \otimes u_j = \mathrm{Id}_{\mathbb{R}^n} \qquad (4.3)$$

with each $u_j$ being a unit vector spanning $V_j$. They did this as an intermediate step in a short
proof of the Lebesgue measure version of the B-L inequality under the condition (4.3) - the so-called
geometric case. Perhaps because their main focus was the Lebesgue measure case, in which (4.3)
is not a necessary condition for finiteness of the constant $C$, they did not address the necessity of
this condition in the Gaussian case.

Indeed, the inequality (4.1) is equivalent to its Lebesgue measure analog, which is known to
hold with the constant $C = 1$ under the condition (4.2). To see this, define $g_1, \dots, g_N$ by

$$g_j(y) = f_j(y)\,\big(\gamma_{d_j}(y)\big)^{1/p_j}, \qquad j = 1, \dots, N.$$

As noted in [2], this change of variable allows one to pass back and forth between the Gaussian and
Lebesgue measure version of the B-L inequality - under the condition (1.15).

Nonetheless, it is worthwhile to give a proof of Theorem 4.1 here for two reasons: First, it
may be surprising that the condition (1.15) is necessary for the inequality to hold with any finite
constant at all. Second, the proof we will give of sufficiency of the condition (1.15) serves as a
model for the proof of the corresponding theorem in the Clifford algebra case that we consider in
the next section.

In proving Theorem 4.1, our first step is to pass to the problem of proving a generalized
subadditivity inequality. Because the commutative version of Theorem 2.2 has been proved in [10],
Theorem 4.2 below on subadditivity of entropy with respect to a Gaussian reference
measure is equivalent to Theorem 4.1. Hence, it suffices to prove one or the other.

Before stating and proving the subadditivity theorem, we first recall that for any probability
density $\rho$ on $(\mathbb{R}^m, d\gamma_m)$, the entropy of $\rho$ is defined by

$$S(\rho) = -\int_{\mathbb{R}^m} \rho(x) \ln \rho(x)\,\gamma_m(x)\,dx.$$

Note that the relative entropy of $\rho(y)\gamma_m(y)dy$ to $\gamma_m(y)dy$ is $-S(\rho)$; in the convention employed
here, the entropy $S$ is concave, and the relative entropy is convex.

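4.2 THEOREM. Let $V_1, \dots, V_N$ be $N$ non zero subspaces of $\mathbb{R}^n$, and for each $j$ let $d_j$ denote
the dimension of $V_j$. Define $P_j$ to be the orthogonal projection of $\mathbb{R}^n$ onto $V_j$. For any probability
density $\rho$ on $(\mathbb{R}^n, d\gamma_n)$, let $\rho_{V_j}$ denote the marginal density on $(V_j, d\gamma_{d_j})$. Then, given the numbers
$p_j$, $1 \le p_j \le \infty$ for $j = 1, \dots, N$, there exists a finite constant $C$ such that

$$\sum_{j=1}^N \frac{1}{p_j} S(\rho_{V_j}) \ge S(\rho) - \ln(C) \qquad (4.4)$$

holds for all probability densities $\rho$ on $(\mathbb{R}^n, d\gamma_n)$, if and only if

$$\sum_{j=1}^N \frac{1}{p_j} P_j \le \mathrm{Id}_{\mathbb{R}^n} \qquad (4.5)$$

and in this case, $\ln(C) = 0$.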



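We first prove necessity of the condition (4.5):

4.3 LEMMA. The condition (4.5) in Theorem 4.2 is necessary.

Proof: It suffices to consider densities of the form

$$\rho(x) = \exp\big(b \cdot x - |b|^2/2\big)$$

for $b \in \mathbb{R}^n$. Then

$$\rho_{V_j}(x) = \exp\big(P_j b \cdot x - |P_j b|^2/2\big)$$

and we compute:

$$S(\rho) = -\frac{|b|^2}{2} \qquad\text{and}\qquad S(\rho_{V_j}) = -\frac{|P_j b|^2}{2}.$$

Thus

$$\sum_{j=1}^N \frac{1}{p_j} S(\rho_{V_j}) - S(\rho) = \frac{1}{2}\left( |b|^2 - \sum_{j=1}^N \frac{1}{p_j} |P_j b|^2 \right),$$

and evidently this is bounded below if and only if (4.5) is satisfied.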
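The necessity argument can be illustrated numerically. The sketch below evaluates the quantity $\frac12\big(|b|^2 - \sum_j \frac{1}{p_j}|P_j b|^2\big)$ for shifted Gaussian densities, assuming one dimensional subspaces $V_j$ spanned by unit vectors $u_j$ in $\mathbb{R}^2$, so that $|P_j b| = |u_j \cdot b|$. The function name and the 120 degree configuration are hypothetical choices made for illustration.

```python
import math

def entropy_deficit(b, units, p):
    """(1/2) * (|b|^2 - sum_j (1/p_j) (u_j . b)^2) for one dimensional V_j = span(u_j)."""
    norm_sq = sum(x * x for x in b)
    projected = sum(
        (sum(ui * bi for ui, bi in zip(u, b)) ** 2) / pj
        for u, pj in zip(units, p)
    )
    return 0.5 * (norm_sq - projected)

# Hypothetical example: three unit vectors at mutual angles of 120 degrees in R^2.
UNITS = [(1.0, 0.0), (-0.5, math.sqrt(3.0) / 2.0), (-0.5, -math.sqrt(3.0) / 2.0)]

if __name__ == "__main__":
    # p_j = 3/2: the weighted projections sum to Id, so the deficit vanishes for every b.
    print(entropy_deficit((2.0, 0.0), UNITS, [1.5, 1.5, 1.5]))
    # p_j = 1.2: the weighted sum is 1.25 * Id and the deficit is -|b|^2 / 8.
    for scale in (1.0, 10.0, 100.0):
        print(scale, entropy_deficit((scale, 0.0), UNITS, [1.2, 1.2, 1.2]))
```

In the second case the values decrease like $-|b|^2/8$, so no finite constant $C$ can work once the condition is violated.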
4.1 Proof of sufficiency

The sufficiency of the condition (4.5) will be proved using an interpolation between an arbitrary
density $\rho$ and the uniform density that is provided by the Mehler semigroup. (Indeed, Barthe and
Cordero-Erausquin used the Mehler semigroup in their work [2] mentioned above, but in a direct
proof of the Gaussian B-L inequality inspired by the heat-flow method introduced in [13]. The heat
flow approach to prove subadditivity inequalities was developed in [3] and [10].)

The Mehler semigroup is the strongly continuous semigroup of positivity preserving contractions
on $L^2(\mathbb{R}^n, \gamma_n(x)dx)$ whose generator $-\mathcal{N}$ is given by the Dirichlet form

$$\mathcal{E}(f,g) = \int_{\mathbb{R}^n} \nabla f^*(x) \cdot \nabla g(x)\,\gamma_n(x)\,dx \qquad (4.6)$$

through $\langle f, \mathcal{N} g\rangle_{L^2(\gamma_n)} = \mathcal{E}(f,g)$, where $f^*$ is the complex conjugate of $f$. Integrating by parts, one
finds

$$\mathcal{N} = -(\Delta - x \cdot \nabla).$$

The eigenvalues of $\mathcal{N}$ are the non-negative integers, and the eigenfunctions are the Hermite
polynomials. (In certain physical contexts, the eigenvalues count the occupancy of quantum states and $\mathcal{N}$
is called the Boson number operator.)

There is a simple explicit formula for $e^{-t\mathcal{N}}$:

$$e^{-t\mathcal{N}} f(x) = \int_{\mathbb{R}^n} f\left(e^{-t}x + \sqrt{1 - e^{-2t}}\,y\right)\gamma_n(y)\,dy, \qquad (4.7)$$

which is easily checked.
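One way to check the formula in a simple case is to apply it to a Hermite polynomial. The sketch below works in one dimension, approximates the integral in (4.7) by the trapezoid rule on a truncated interval, and compares the result for $f(z) = z^2 - 1$ with $e^{-2t} f(x)$, since this $f$ is an eigenfunction of $\mathcal{N}$ with eigenvalue 2. The quadrature parameters and function names are assumptions chosen for illustration.

```python
import math

def mehler(f, x, t, half_width=10.0, steps=4000):
    """Approximate (e^{-tN} f)(x) in one dimension via the trapezoid rule applied to (4.7)."""
    c = math.exp(-t)
    s = math.sqrt(1.0 - c * c)
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        gauss = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        total += weight * f(c * x + s * y) * gauss
    return total * h

def hermite2(z):
    """Second Hermite polynomial z^2 - 1; N applied to it gives 2 (z^2 - 1)."""
    return z * z - 1.0

if __name__ == "__main__":
    x, t = 0.7, 0.3
    # The two printed numbers should agree up to quadrature error.
    print(mehler(hermite2, x, t), math.exp(-2.0 * t) * hermite2(x))
```

The same routine applied to the constant function 1 returns 1 up to quadrature error, which reflects $\mathcal{N}1 = 0$.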

Since evidently $\mathcal{N}1 = 0$, and $e^{-t\mathcal{N}}$ is self-adjoint, it also preserves integrals against $\gamma_n(x)dx$,
and hence, if $\rho$ is any probability density, so is each $\rho_t := e^{-t\mathcal{N}}\rho$. As one sees from (4.7),

$$\lim_{t\to\infty} e^{-t\mathcal{N}}\rho(x) = 1, \qquad (4.8)$$

the uniform probability density on $(\mathbb{R}^n, \gamma_n(x)dx)$, and thus the Mehler semigroup provides us with
an interpolation between any probability density $\rho$ and the uniform density 1.

This interpolation is well-behaved with respect to the operation of taking marginals: Consider
any probability density $\rho$ on $(\mathbb{R}^n, \gamma_n(x)dx)$, and any $m$ dimensional subspace $V$ of $\mathbb{R}^n$. Let $\rho_V$ be
the marginal density of $\rho$ as in Theorem 4.2. Then of course, we may regard $\rho_V$ as a probability
density on $(\mathbb{R}^n, \gamma_n(x)dx)$ that is constant along directions in $V^\perp$. (Simply compose $\rho_V$ with $P_V$.)
Interpreted this way, so that both $\rho$ and $\rho_V$ are functions on $\mathbb{R}^n$,

$$(e^{-t\mathcal{N}}\rho)_V = e^{-t\mathcal{N}}(\rho_V). \qquad (4.9)$$

That is, taking marginals commutes with the action of the Mehler semigroup.

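The next point to note is that the entropy is monotone increasing along this interpolation:
Differentiating, with $\rho_t = e^{-t\mathcal{N}}\rho$,

$$\frac{d}{dt} S(\rho_t) = -\int_{\mathbb{R}^n} \ln(\rho_t)\,(\Delta - x\cdot\nabla)\rho_t\,\gamma_n\,dx = \int_{\mathbb{R}^n} \nabla \ln\rho_t \cdot \nabla\rho_t\,\gamma_n\,dx = \mathcal{E}(\ln\rho_t, \rho_t).$$

For any smooth density $\rho$, $\mathcal{E}(\ln\rho, \rho) = \int_{\mathbb{R}^n} \nabla\ln\rho\cdot\nabla\rho\,\gamma_n\,dx = \int_{\mathbb{R}^n} |\nabla\ln\rho|^2 \rho\,\gamma_n\,dx$, and hence $S(\rho_t)$
is strictly increasing for all $t$. Moreover, since $(x,t) \mapsto |x|^2/t$ is jointly convex on $\mathbb{R}^n \times \mathbb{R}_+$,
$\rho \mapsto \mathcal{E}(\ln\rho, \rho)$ has a unique extension as a convex functional on the set of all probability densities on
$(\mathbb{R}^n, \gamma_n(x)dx)$.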
4.4 DEFINITION (Entropy Production). The entropy production functional is the convex
functional $D(\rho)$ on probability densities on $(\mathbb{R}^n, \gamma_n(x)dx)$ given by

$$D(\rho) = \int_{\mathbb{R}^n} \ln\rho(x)\,\mathcal{N}\rho(x)\,\gamma_n(x)\,dx = \mathcal{E}(\ln\rho, \rho). \qquad (4.10)$$

With this definition,

$$\frac{d}{dt} S(e^{-t\mathcal{N}}\rho) = D(e^{-t\mathcal{N}}\rho).$$

Now because of (4.9), for any subspace $V$ of $\mathbb{R}^n$,

$$\frac{d}{dt} S([e^{-t\mathcal{N}}\rho]_V) = D([e^{-t\mathcal{N}}\rho]_V).$$

Now, since $[e^{-t\mathcal{N}}\rho]_V$ is constant along directions orthogonal to $V$, the derivatives in those directions
that figure in $D([e^{-t\mathcal{N}}\rho]_V)$ are irrelevant; we need only take derivatives along directions in $V$. This
consideration leads to the definitions of the restricted number operator, and the restricted entropy
production:

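Given an $m$ dimensional subspace $V$ of $\mathbb{R}^n$, let $P_V$ be the orthogonal projection onto $V$. The
restricted number operator $\mathcal{N}_V$ is the self-adjoint operator on $L^2(\mathbb{R}^n, \gamma_n(x)dx)$ defined through

$$\langle f, \mathcal{N}_V g\rangle_{L^2(\gamma_n)} = \int_{\mathbb{R}^n} \nabla f^*(x) \cdot P_V \nabla g(x)\,\gamma_n(x)\,dx, \qquad (4.11)$$

and the restricted entropy production functional $D_V(\rho)$ is the convex functional given by

$$D_V(\rho) = \int_{\mathbb{R}^n} \big(\mathcal{N}_V \ln\rho(x)\big)\,\rho(x)\,\gamma_n(x)\,dx. \qquad (4.12)$$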



With this definition, $D(\rho_V) = D_V(\rho_V)$. However, there is a crucial difference between $D_V(\rho)$
and $D(\rho_V)$:

4.5 LEMMA. For any smooth probability density $\rho$ on $(\mathbb{R}^n, \gamma_n(x)dx)$, and any $m$ dimensional
subspace $V$ of $\mathbb{R}^n$, let $\rho_V$ be the corresponding marginal density regarded as a probability density on
$(\mathbb{R}^n, \gamma_n(x)dx)$. Then

$$D(\rho_V) \le D_V(\rho). \qquad (4.13)$$

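Proof: Regard $\rho_V$ as a function on $\mathbb{R}^n$ (by composing it with $P_V$). Assume that $\rho$ is smooth and
bounded above and below by strictly positive numbers. Notice that since $\rho_V$ is constant
along directions in $V^\perp$,

$$\mathcal{N} \ln\rho_V = \mathcal{N}_V \ln\rho_V,$$

and hence

$$D(\rho_V) = \int_{\mathbb{R}^n} [\mathcal{N}_V \ln\rho_V(x)]\,\rho_V(x)\,\gamma_n(x)\,dx = \int_{\mathbb{R}^n} [\mathcal{N}_V \ln\rho_V(x)]\,\rho(x)\,\gamma_n(x)\,dx,$$

where we have used the definition of $\rho_V$ to replace the second $\rho_V$ by $\rho$ itself. Then, integrating by
parts using the definition of $\mathcal{N}_V$, and applying the Schwarz inequality, we obtain:

$$\begin{aligned}
D(\rho_V) &= \int_{\mathbb{R}^n} (\nabla \ln\rho_V(x)) \cdot P_V \nabla\rho(x)\,\gamma_n(x)\,dx \\
&= \int_{\mathbb{R}^n} (\nabla \ln\rho_V(x)) \cdot P_V (\nabla\ln\rho(x))\,\rho(x)\,\gamma_n(x)\,dx \\
&\le \left( \int_{\mathbb{R}^n} |\nabla\ln\rho_V(x)|^2 \rho(x)\,\gamma_n(x)\,dx \right)^{1/2} \left( \int_{\mathbb{R}^n} |P_V\nabla\ln\rho(x)|^2 \rho(x)\,\gamma_n(x)\,dx \right)^{1/2}.
\end{aligned} \qquad (4.14)$$

In the first factor in the last line, we may replace $\rho$ by $\rho_V$ since $|\nabla\ln\rho_V(x)|$ depends on $x$ only
through $P_V x$. Hence this factor is simply $\sqrt{D(\rho_V)}$, and the second factor is $\sqrt{D_V(\rho)}$. Dividing
both sides by $\sqrt{D(\rho_V)}$ and squaring gives (4.13). ■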
The proof we have just given is patterned on the proof of an analogous result in the Lebesgue
measure case in [10], which in turn is based on similar arguments in [9] and [3]. It is somewhat
more complicated to adapt the argument to the Clifford algebra setting, but this is what we shall
do in the next section. We are now ready to prove the sufficiency of condition (4.5):



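4.6 LEMMA. The condition (4.5) in Theorem 4.2 is sufficient.
Proof: For a probability density $\rho$ on $(\mathbb{R}^n, d\gamma_n)$ with $S(\rho) > -\infty$, it is easy to see that

$$\lim_{t\to\infty} S(e^{-t\mathcal{N}}\rho) = S(1) = 0,$$

and hence, $\lim_{t\to\infty} S(e^{-t\mathcal{N}}(\rho_{V_j})) = 0$ for each $j = 1, \dots, N$. Therefore, it suffices to show that

$$a(t) := \sum_{j=1}^N \frac{1}{p_j} S(e^{-t\mathcal{N}}\rho_{V_j}) - S(e^{-t\mathcal{N}}\rho)$$

is monotone decreasing in $t$.

Differentiating, and using (4.9), and then Lemma 4.5,

$$\begin{aligned}
\frac{d}{dt} a(t) &= \sum_{j=1}^N \frac{1}{p_j} D\big((e^{-t\mathcal{N}}\rho)_{V_j}\big) - D(e^{-t\mathcal{N}}\rho) \\
&\le \sum_{j=1}^N \frac{1}{p_j} D_{V_j}(e^{-t\mathcal{N}}\rho) - D(e^{-t\mathcal{N}}\rho).
\end{aligned} \qquad (4.15)$$

Now note that by (4.12), whenever (4.5) is satisfied,

$$\sum_{j=1}^N \frac{1}{p_j} D_{V_j}(\sigma) \le D(\sigma)$$

for any smooth density $\sigma$. Hence the derivative of $a(t)$ is non-positive for all $t > 0$. ■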
5 Generalized subadditivity of the entropy in Clifford algebras

In this section we shall prove 

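5.1 THEOREM. Let $V_1, \dots, V_N$ be $N$ subspaces of $\mathbb{R}^n$, and let $\mathfrak{A}_j$ be the Clifford algebra over $V_j$
with the inner product $V_j$ inherits from $\mathbb{R}^n$, and let $\mathfrak{A}_j$ be equipped with its unique tracial state $\tau_j$. For
any probability density $\rho \in \mathfrak{A}$, let $\rho_{V_j}$ be the induced probability density in $\mathfrak{A}_j$. Let $S(\rho) = -\tau(\rho\ln\rho)$
and $S(\rho_{V_j}) = -\tau_j(\rho_{V_j}\ln\rho_{V_j})$.

Then

$$\sum_{j=1}^N \frac{1}{p_j} S(\rho_{V_j}) \ge S(\rho) \qquad (5.1)$$

for all probability densities $\rho \in \mathfrak{A}$ if and only if

$$\sum_{j=1}^N \frac{1}{p_j} P_j \le \mathrm{Id}_{\mathbb{R}^n}, \qquad (5.2)$$

where $P_j$ is the orthogonal projection onto $V_j$ in $\mathbb{R}^n$.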

Granted this result, we have:
Proof of Theorem 1.6: Theorem 2.2 and Theorem 5.1 together prove Theorem 1.6. ■
We shall now prove Theorem 5.1. As before, we begin by proving the necessity of (5.2).

5.2 LEMMA. The condition (5.2) in Theorem 5.1 is necessary.
Proof: For any vector $a = (a_1, \dots, a_n) \in \mathbb{R}^n$, define

$$\rho_a = I + \sum_{j=1}^n a_j Q_j = I + a \cdot Q.$$

Then $\rho_a$ is a probability density if and only if $|a| \le 1$. Indeed, $\rho_a$ has only two eigenvalues, $1 \pm |a|$,
with equal multiplicity.

Then $(\rho_a)_{V_j} = I + (P_j a)\cdot Q$ and so $(\rho_a)_{V_j}$ has only two eigenvalues, $1 \pm |P_j a|$, with equal
multiplicity. Therefore,

$$S(\rho_a) = -\psi(|a|) \qquad\text{and}\qquad S((\rho_a)_{V_j}) = -\psi(|P_j a|), \qquad (5.3)$$

where $\psi(x)$ is the convex function defined by

$$\psi(x) = \begin{cases} \frac{1}{2}\big[(1+x)\ln(1+x) + (1-x)\ln(1-x)\big] & \text{if } |x| \le 1, \\ \infty & \text{otherwise.} \end{cases} \qquad (5.4)$$

Thus, for (5.1) to hold for each $\rho_a$, $|a| \le 1$, it must be the case that

$$\sum_{j=1}^N \frac{1}{p_j} \psi(|P_j a|) \le \psi(|a|) \quad\text{for all } a \text{ with } |a| \le 1. \qquad (5.5)$$

Then since $\psi(x) = \frac{1}{2}x^2 + O(x^4)$, replacing $a$ by $ta$, $0 < t < 1$, dividing by $t^2$ and letting $t \to 0$,
we see that (5.2) must hold. ■
Because of (5.3), once we have proved Theorem 5.1 we will have a proof of (5.5). However, it
is of interest to have a direct proof of this inequality.
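The eigenvalue claim and the entropy formula (5.3) can be checked by hand in the smallest case. The sketch below assumes $n = 2$ and represents the two generators by the Pauli matrices $X$ and $Y$ acting on $\mathbb{C}^2$, with $\tau$ the normalized trace; this concrete representation and all function names are assumptions made for illustration, not the representation fixed in the paper.

```python
import math

def psi(x):
    """psi(x) = (1/2)[(1+x) ln(1+x) + (1-x) ln(1-x)] for |x| < 1, as in (5.4)."""
    return 0.5 * ((1.0 + x) * math.log1p(x) + (1.0 - x) * math.log1p(-x))

def density(a1, a2):
    """rho_a = I + a1*X + a2*Y as a 2x2 complex matrix (X, Y are the Pauli matrices)."""
    return [[1.0 + 0j, complex(a1, -a2)],
            [complex(a1, a2), 1.0 + 0j]]

def eigenvalues_hermitian_2x2(m):
    """Eigenvalues (smaller, larger) of a 2x2 Hermitian matrix."""
    half_trace = 0.5 * (m[0][0].real + m[1][1].real)
    half_diff = 0.5 * (m[0][0].real - m[1][1].real)
    root = math.sqrt(half_diff * half_diff + abs(m[0][1]) ** 2)
    return half_trace - root, half_trace + root

def entropy(m):
    """S(rho) = -tau(rho ln rho), with tau the normalized trace on 2x2 matrices."""
    return -0.5 * sum(lam * math.log(lam)
                      for lam in eigenvalues_hermitian_2x2(m) if lam > 0.0)

if __name__ == "__main__":
    a = (0.3, 0.4)  # |a| = 0.5
    print(eigenvalues_hermitian_2x2(density(*a)))  # should be close to 1 - |a| and 1 + |a|
    print(entropy(density(*a)), -psi(math.hypot(*a)))  # the two values should agree
```

Since the marginal $(\rho_a)_{V_j}$ has the same form with $a$ replaced by $P_j a$, the same routine applied to $P_j a$ illustrates the second identity in (5.3).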
5.3 PROPOSITION. The inequality (5.5) holds whenever (5.2) is satisfied.
Proof: An easy calculation of derivatives shows that

$$\psi'(x) = \operatorname{arctanh}(x) \qquad\text{and}\qquad \psi''(x) = \frac{1}{1 - x^2}$$

for $|x| < 1$.

Now fix any $a$ with $|a| < 1$. Then, for $t \ge 0$, define

$$\eta(t) = \psi(e^{-t}|a|) - \sum_{j=1}^N \frac{1}{p_j} \psi(e^{-t}|P_j a|).$$

We have to show that $\eta(t) \ge 0$ for all $t \ge 0$. Since evidently $\lim_{t\to\infty} \eta(t) = 0$, it suffices to show
that $\eta'(t) \le 0$ for all $t > 0$.
Differentiating, we find

$$\eta'(t) = -e^{-t}\left[ |a| \operatorname{arctanh}(e^{-t}|a|) - \sum_{j=1}^N \frac{1}{p_j} |P_j a| \operatorname{arctanh}(e^{-t}|P_j a|) \right] =: -e^{-t}\varphi(t).$$

Hence, it suffices to show that $\varphi(t) \ge 0$ for all $t > 0$. Since once again, $\lim_{t\to\infty} \varphi(t) = 0$, it
suffices to show that $\varphi'(t) \le 0$ for all $t > 0$. Differentiating, we find

$$\varphi'(t) = -e^{-t}\left[ \frac{|a|^2}{1 - e^{-2t}|a|^2} - \sum_{j=1}^N \frac{1}{p_j} \frac{|P_j a|^2}{1 - e^{-2t}|P_j a|^2} \right].$$

Multiplying through by $e^{-t}$, and absorbing a factor of $e^{-t}$ into $a$, it suffices to show that

$$\frac{|a|^2}{1 - |a|^2} \ge \sum_{j=1}^N \frac{1}{p_j} \frac{|P_j a|^2}{1 - |P_j a|^2} \qquad (5.6)$$

for all $|a| < 1$. However, since $|a| \ge |P_j a|$,

$$\frac{|P_j a|^2}{1 - |a|^2} \ge \frac{|P_j a|^2}{1 - |P_j a|^2},$$

and thus (5.6) follows from (5.2). ■
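The inequality (5.5) lends itself to a quick numerical sanity check. The sketch below samples vectors $a$ in the open unit disc of $\mathbb{R}^2$ and evaluates the gap $\psi(|a|) - \sum_j \frac{1}{p_j}\psi(|P_j a|)$, assuming one dimensional subspaces spanned by three unit vectors at mutual angles of 120 degrees, for which (5.2) holds with equality when $p_j = 3/2$. The configuration, sample size and function names are hypothetical.

```python
import math
import random

def psi(x):
    """psi(x) = (1/2)[(1+x) ln(1+x) + (1-x) ln(1-x)] for |x| < 1, as in (5.4)."""
    return 0.5 * ((1.0 + x) * math.log1p(x) + (1.0 - x) * math.log1p(-x))

# Hypothetical configuration: unit vectors at mutual angles of 120 degrees in R^2.
UNITS = [(1.0, 0.0), (-0.5, math.sqrt(3.0) / 2.0), (-0.5, -math.sqrt(3.0) / 2.0)]

def gap_5_5(a, units, p):
    """psi(|a|) - sum_j (1/p_j) psi(|u_j . a|), which (5.5) asserts is non-negative."""
    norm = math.hypot(a[0], a[1])
    weighted = sum(psi(abs(u[0] * a[0] + u[1] * a[1])) / pj for u, pj in zip(units, p))
    return psi(norm) - weighted

def smallest_gap(units, p, samples=2000, seed=0):
    """Smallest gap found over randomly sampled a with |a| < 0.999."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        r = 0.999 * rng.random()
        theta = 2.0 * math.pi * rng.random()
        worst = min(worst, gap_5_5((r * math.cos(theta), r * math.sin(theta)), units, p))
    return worst

if __name__ == "__main__":
    print(smallest_gap(UNITS, [1.5, 1.5, 1.5]))       # expected to be non-negative up to rounding
    print(gap_5_5((0.1, 0.0), UNITS, [1.2, 1.2, 1.2]))  # (5.2) fails here, and so does (5.5)
```

A random search of this kind cannot prove the inequality; it only fails to find a counterexample when (5.2) holds, and it does find one when (5.2) is violated.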
We are now in a position to give an elementary proof of Theorem 1.6 in the special case that each
$V_j$ is one dimensional. As explained in Example 1.5, it suffices in this case to prove the following:

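5.4 PROPOSITION. Suppose $\{u_1, \dots, u_N\}$ is any set of $N$ unit vectors in $\mathbb{R}^n$, and $\{p_1, \dots, p_N\}$
is any set of $N$ positive numbers such that

$$\sum_{j=1}^N \frac{1}{p_j}\, u_j \otimes u_j \le \mathrm{Id}_{\mathbb{R}^n}. \qquad (5.7)$$

Then for any $b = (b_1, \dots, b_N)$ in $\mathbb{R}^N$,

$$\ln\cosh\left( \Big| \sum_{j=1}^N b_j u_j \Big| \right) \le \sum_{j=1}^N \frac{1}{p_j} \ln\cosh(p_j b_j). \qquad (5.8)$$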
Proof: Let $\psi^*(x)$ denote the function $\psi^*(x) = \ln\cosh(x)$. The notation is meant to indicate
the well known fact, easily checked, that $\psi^*$ is the Legendre transform of the function $\psi$ defined in
(5.4).

Now, given a set of $N$ orthogonal projections $\{P_1, \dots, P_N\}$ satisfying (5.2), we may make any
choice of a unit vector $u_j$ from the range of $P_j$, and then the $N$ unit vectors $\{u_1, \dots, u_N\}$ will
satisfy (5.7). Conversely, given any set of $N$ unit vectors $\{u_1, \dots, u_N\}$ that satisfy (5.7), we may
take $P_j = u_j \otimes u_j$, and then (5.2) is satisfied. Hence, we suppose we are given a set of $N$ orthogonal
projections $\{P_1, \dots, P_N\}$ satisfying (5.2), and for each $j$, $u_j$ is a unit vector in the range of $P_j$.

Then for any $b \in \mathbb{R}^N$,

$$\begin{aligned}
\psi^*\left( \Big| \sum_{j=1}^N b_j u_j \Big| \right) &= \sup_{|a| \le 1} \left\{ a \cdot \sum_{j=1}^N b_j u_j - \psi(|a|) \right\} \\
&= \sup_{|a| \le 1} \left\{ \sum_{j=1}^N P_j a \cdot b_j u_j - \psi(|a|) \right\} \\
&\le \sup_{|a| \le 1} \left\{ \sum_{j=1}^N P_j a \cdot b_j u_j - \sum_{j=1}^N \frac{1}{p_j} \psi(|P_j a|) \right\} \\
&\le \sup_{|a| \le 1} \left\{ \sum_{j=1}^N |P_j a|\,|b_j| - \sum_{j=1}^N \frac{1}{p_j} \psi(|P_j a|) \right\},
\end{aligned} \qquad (5.9)$$

where the first inequality is from (5.5), and the second is from Schwarz. Then, since by the definition of
the Legendre transform, for any $a$,

$$\psi^*(p_j b_j) \ge |P_j a|\,(p_j |b_j|) - \psi(|P_j a|),$$

we obtain

$$\psi^*\left( \Big| \sum_{j=1}^N b_j u_j \Big| \right) \le \sum_{j=1}^N \frac{1}{p_j} \psi^*(p_j b_j),$$

which is (5.8). ■
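The inequality (5.8) can also be probed numerically. The following sketch evaluates the difference between its right and left sides at randomly chosen $b$, assuming three unit vectors at mutual angles of 120 degrees in $\mathbb{R}^2$ and $p_j = 3/2$, for which (5.7) holds with equality. The helper names, the sampling range and the configuration are assumptions made for illustration.

```python
import math
import random

def ln_cosh(x):
    """Numerically stable ln cosh(x), using ln cosh x = |x| + ln(1 + e^{-2|x|}) - ln 2."""
    x = abs(x)
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)

# Hypothetical configuration: unit vectors at mutual angles of 120 degrees in R^2.
UNITS = [(1.0, 0.0), (-0.5, math.sqrt(3.0) / 2.0), (-0.5, -math.sqrt(3.0) / 2.0)]

def gap_5_8(b, units, p):
    """sum_j (1/p_j) ln cosh(p_j b_j) - ln cosh(|sum_j b_j u_j|); (5.8) says this is >= 0."""
    vx = sum(bj * u[0] for bj, u in zip(b, units))
    vy = sum(bj * u[1] for bj, u in zip(b, units))
    lhs = ln_cosh(math.hypot(vx, vy))
    rhs = sum(ln_cosh(pj * bj) / pj for bj, pj in zip(b, p))
    return rhs - lhs

def smallest_gap(units, p, samples=2000, seed=0, scale=5.0):
    """Smallest gap found over b sampled uniformly from [-scale, scale]^N."""
    rng = random.Random(seed)
    return min(
        gap_5_8([rng.uniform(-scale, scale) for _ in units], units, p)
        for _ in range(samples)
    )

if __name__ == "__main__":
    print(gap_5_8([1.0, 0.0, 0.0], UNITS, [1.5, 1.5, 1.5]))
    print(smallest_gap(UNITS, [1.5, 1.5, 1.5]))
```

For $b = (t, -t/2, -t/2)$ in this configuration the inequality reduces to $\cosh(2s) \le \cosh^4(s)$ with $s = 3t/4$, which follows from $(\cosh^2 s - 1)^2 \ge 0$.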
We now prove Theorem 5.1 in full generality. This gives another proof of the last two
propositions, but by less elementary means. The proof will follow the basic pattern of the proof of
Theorem 4.2 and use the Clifford algebra analog of the Mehler semigroup. This is the so-called
Clifford-Mehler semigroup, about which we now recall a few relevant facts.



5.1 About the Clifford-Mehler semigroup

There is also a differential calculus in the Clifford algebra. Let $Q_1, \dots, Q_n$ be any set of $n$ generators
for the Clifford algebra $\mathfrak{C}$ over $\mathbb{R}^n$. For $A \in \mathfrak{C}$, define

$$\nabla_i(A) = \frac{1}{2}\big[ Q_i A - \Gamma(A) Q_i \big],$$

where $\Gamma$ is the grading operator on $\mathfrak{C}$: That is, using the notation in (1.10),

$$\Gamma(Q^\alpha) = (-1)^{|\alpha|} Q^\alpha.$$

One computes that $\nabla_i(Q^\alpha) = 0$ if $\alpha(i) = 0$, and otherwise, $\nabla_i(Q^\alpha)$ is what one gets by
anti-commuting the factor of $Q_i$ through to the left, and then removing it. In this sense it is like
a differentiation operator, and what is more, it is a skew derivation on $\mathfrak{C}$, which means that for all
$A$ and $B$ in $\mathfrak{C}$, $\nabla_i(AB) = \nabla_i(A)B + \Gamma(A)\nabla_i(B)$.

The Clifford algebra analog of the Gaussian energy integral (4.6) is given by

$$\mathcal{E}(A,B) = \tau\left( \sum_{j=1}^n (\nabla_j A)^*\,\nabla_j B \right), \qquad (5.10)$$

for all $A, B \in \mathfrak{C}$. This is the Clifford Dirichlet form studied by Gross. Then, the Fermionic number
operator, also denoted $\mathcal{N}$, is defined by

$$\mathcal{E}(A, B) = \tau(A^* \mathcal{N}(B)).$$

It is easy to see that the spectrum of $\mathcal{N}$ consists of the non negative integers $\{0, 1, \dots, n\}$ and that

$$\mathcal{N} Q^\alpha = |\alpha|\, Q^\alpha. \qquad (5.11)$$

The Clifford Mehler semigroup is then given by $e^{-t\mathcal{N}}$. It is clear from this definition, (1.11) and
(5.11) that for any $A \in \mathfrak{C}$, $\lim_{t\to\infty} e^{-t\mathcal{N}}(A) = \tau(A) I$. Thus for any probability density $\rho$ in $\mathfrak{C}$,

$$t \mapsto \rho_t = e^{-t\mathcal{N}}(\rho)$$

provides an interpolation between $\rho$ and $I$, and each $\rho_t$ is a probability density. This corresponds
exactly to the Mehler semigroup interpolation that was used to prove Theorem 4.2 and we shall
use it here in the same way, though some additional complications shall arise.

$\mathcal{N}$ does not depend on the choice of the set of generators $Q_1, \dots, Q_n$. Indeed, if $\{u_1, \dots, u_n\}$ is
any orthonormal basis of $\mathbb{R}^n$, and we define $\widetilde{Q}_j = u_j \cdot Q$, $j = 1, \dots, n$, then the Clifford Dirichlet
form that one obtains using this basis to define the derivatives is the same as the original.

In particular, given an $m$ dimensional subspace $V$ of $\mathbb{R}^n$, we may choose $\{u_1, \dots, u_n\}$ so that
$\{u_1, \dots, u_m\}$ is an orthonormal basis for $V$, and then the first $m$ generators will be a set of generators
for $\mathfrak{C}(V)$. We then define the reduced Clifford Dirichlet form $\mathcal{E}_V$ by

$$\mathcal{E}_V(A,B) = \tau\left( \sum_{i,j=1}^n (\nabla_i A)^*\,[P_V]_{i,j}\,\nabla_j B \right), \qquad (5.12)$$

where $[P_V]_{i,j}$ is the $i,j$'th entry of the $n \times n$ matrix for $P_V$. The restricted number operator $\mathcal{N}_V$ is
then the self-adjoint operator on $L^2(\mathfrak{C})$ given by $\tau(A^* \mathcal{N}_V(B)) = \mathcal{E}_V(A,B)$.

Now, for any probability density $\rho$ in $\mathfrak{C}$ let $\rho_V$ be the corresponding marginal density regarded
as an operator in $\mathfrak{C}$ by identifying it with $\phi_V(\rho_V)$, where $\phi_V$ is the canonical embedding of $\mathfrak{C}(V)$
into $\mathfrak{C}(\mathbb{R}^n)$. Then it is an easy consequence of the definitions that

$$(e^{-t\mathcal{N}}\rho)_V = e^{-t\mathcal{N}}(\rho_V) = e^{-t\mathcal{N}_V}(\rho_V). \qquad (5.13)$$

Also, under the condition (5.2), it is easy to see that

$$\sum_{j=1}^N \frac{1}{p_j} \mathcal{N}_{V_j} \le \mathcal{N}. \qquad (5.14)$$
j= i n 

Finally, we introduce entropy production $D(\rho)$: With $\rho_t := e^{-t\mathcal{N}}\rho$, we differentiate and find

$$\frac{d}{dt} S(\rho_t) = \tau\big(\ln(\rho_t)\,\mathcal{N}(\rho_t)\big) = \mathcal{E}(\ln(\rho_t), \rho_t).$$

5.5 DEFINITION (Entropy Production). The entropy production functional at a probability
density $\rho$ is the functional defined by

$$D(\rho) = \tau\big(\ln(\rho)\,\mathcal{N}(\rho)\big) = \mathcal{E}(\ln(\rho), \rho). \qquad (5.15)$$

Given an $m$ dimensional subspace $V$ of $\mathbb{R}^n$, the restricted entropy production functional at a
probability density $\rho$ is the functional defined by

$$D_V(\rho) = \tau\big(\ln(\rho)\,\mathcal{N}_V(\rho)\big) = \mathcal{E}_V(\ln(\rho), \rho). \qquad (5.16)$$



The following lemma is the basis of our proof of the sufficiency of (5.2). In the course of proving
it, we shall see that both $D(\rho)$ and $D_V(\rho)$ are convex functionals, which is somewhat less obvious
than in the Gaussian case.

5.6 LEMMA. For any probability density $\rho$ in $\mathfrak{C}(\mathbb{R}^n)$, and any $m$ dimensional subspace $V$
of $\mathbb{R}^n$, let $\rho_V$ be the corresponding marginal probability density regarded as a probability density in
$\mathfrak{C}(\mathbb{R}^n)$. Then

$$D(\rho_V) \le D_V(\rho).$$

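Proof: We choose an orthonormal basis $\{u_1, \dots, u_n\}$ for $\mathbb{R}^n$ such that $\{u_1, \dots, u_m\}$ is an
orthonormal basis for $V$. Without loss of generality, we may suppose that $\{u_1, \dots, u_n\}$ is the standard basis
so that $\{Q_1, \dots, Q_m\}$ is a set of generators for $\mathfrak{C}(V)$. Then,

$$\mathcal{E}(A,B) = \tau\left( \sum_{j=1}^n (\nabla_j A)^*\,\nabla_j B \right) \qquad\text{and}\qquad \mathcal{E}_V(A,B) = \tau\left( \sum_{j=1}^m (\nabla_j A)^*\,\nabla_j B \right). \qquad (5.17)$$

It will be convenient to define $\mathcal{N}_j = \nabla_j^* \nabla_j$, $j = 1, \dots, n$. Then we have

$$\mathcal{N} = \sum_{j=1}^n \mathcal{N}_j \qquad\text{and}\qquad \mathcal{N}_V = \sum_{j=1}^m \mathcal{N}_j, \qquad (5.18)$$

and so

$$D_V(\rho) = \sum_{j=1}^m \tau\big((\ln\rho)\,\mathcal{N}_j\rho\big). \qquad (5.19)$$

Since

$$\mathcal{N}_j Q^\alpha = \begin{cases} Q^\alpha & \text{if } \alpha(j) = 1, \\ 0 & \text{if } \alpha(j) = 0, \end{cases}$$

each $\mathcal{N}_j$ is an orthogonal projection, and so (5.19) can be rewritten as

$$D_V(\rho) = \sum_{j=1}^m \tau\big(\mathcal{N}_j(\ln\rho)\,\mathcal{N}_j\rho\big). \qquad (5.20)$$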
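To proceed, we use a formula of Gross [18] for $\mathcal{N}_j f(A)$ where $A \in \mathfrak{C}(\mathbb{R}^n)$, and $f$ is a continuous
function. To write down Gross's formula, first write $A = B + Q_j C$ where both $B$ and $C$ are linear
combinations of the $Q^\alpha$ with $\alpha(j) = 0$. Then define $\widetilde{A} = B - Q_j C$. Notice that if $\rho$ is a probability
density, then $\widetilde{\rho}$ is again a probability density. Gross's formula is

$$\mathcal{N}_j f(A) = \frac{1}{2}\big[ f(A) - f(\widetilde{A}) \big].$$

To prove this formula, notice that there is a unitary operator $U$ such that $\widetilde{A} = UAU^*$. (If the
dimension $n$ is odd, one can take $U$ to be the product, in some order, of all of the $Q_k$ for $k \ne j$; if
the dimension is even, one can add another generator.) Therefore,

$$\widetilde{f(A)} = U f(A) U^* = f(UAU^*) = f(\widetilde{A}).$$

Using this together with the fact that for any $A \in \mathfrak{A}$, $\mathcal{N}_j A = (1/2)[A - \widetilde{A}]$, we obtain Gross's
formula, which we now apply as follows:

$$\begin{aligned}
\tau\big(\mathcal{N}_j(\ln\rho)\,\mathcal{N}_j\rho\big) &= \frac{1}{4}\tau\big( [\ln(\rho) - \ln(\widetilde{\rho})]\,[\rho - \widetilde{\rho}] \big) \\
&= \frac{1}{4}\tau\big( \rho\,[\ln(\rho) - \ln(\widetilde{\rho})] \big) + \frac{1}{4}\tau\big( \widetilde{\rho}\,[\ln(\widetilde{\rho}) - \ln(\rho)] \big) \\
&= \frac{1}{4} H[\rho|\widetilde{\rho}] + \frac{1}{4} H[\widetilde{\rho}|\rho],
\end{aligned} \qquad (5.21)$$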

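where $H[\rho|\sigma] = \tau\big(\rho(\ln\rho - \ln\sigma)\big)$ is the relative entropy of $\rho$ with respect to $\sigma$. As is well known,
$(\rho, \sigma) \mapsto H(\rho|\sigma)$ is jointly convex, and hence

$$\rho \mapsto \tau\big((\ln\rho)\,\mathcal{N}_j\rho\big)$$

is convex. Furthermore, by the fundamental monotonicity property of the relative entropy under
conditional expectations [35],

$$H(\rho_V|\sigma_V) \le H(\rho|\sigma)$$

for any two probability densities $\rho$ and $\sigma$. It follows that $\tau\big((\ln\rho_V)\,\mathcal{N}_j\rho_V\big) \le \tau\big((\ln\rho)\,\mathcal{N}_j\rho\big)$. Summing
on $j$ from 1 to $m$, we find

$$D(\rho_V) = D_V(\rho_V) = \sum_{j=1}^m \tau\big((\ln\rho_V)\,\mathcal{N}_j\rho_V\big) \le \sum_{j=1}^m \tau\big((\ln\rho)\,\mathcal{N}_j\rho\big) = D_V(\rho). \qquad ■$$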



5.2 Proof of the sufficiency 

5.7 LEMMA. The condition (5.2) in Theorem 5.1 is sufficient.
Proof: For a probability density $\rho$ in $\mathfrak{C}(\mathbb{R}^n)$ it is easy to see that

$$\lim_{t\to\infty} S(e^{-t\mathcal{N}}\rho) = S(I) = 0,$$

and hence, $\lim_{t\to\infty} S(e^{-t\mathcal{N}}(\rho_{V_j})) = 0$ for each $j = 1, \dots, N$. Therefore, it suffices to show that

$$a(t) := \sum_{j=1}^N \frac{1}{p_j} S(e^{-t\mathcal{N}}\rho_{V_j}) - S(e^{-t\mathcal{N}}\rho)$$

is monotone decreasing in $t$.

Differentiating, and using (5.13), and then Lemma 5.6,

$$\begin{aligned}
\frac{d}{dt} a(t) &= \sum_{j=1}^N \frac{1}{p_j} D\big((e^{-t\mathcal{N}}\rho)_{V_j}\big) - D(e^{-t\mathcal{N}}\rho) \\
&\le \sum_{j=1}^N \frac{1}{p_j} D_{V_j}(e^{-t\mathcal{N}}\rho) - D(e^{-t\mathcal{N}}\rho).
\end{aligned} \qquad (5.22)$$

Now note that by (5.16), whenever (5.2) is satisfied, $\sum_{j=1}^N \frac{1}{p_j} D_{V_j}(\sigma) \le D(\sigma)$ for any probability
density $\sigma$. Hence the derivative of $a(t)$ is non-positive for all $t > 0$. ■

Notice that the proof is almost identical, symbol for symbol, with the corresponding
proof in the Gaussian case. The main difference of course is that the proof of the main lemma,
Lemma 5.6, is considerably more intricate than that of its Gaussian counterpart.
Proof of Theorem 5.1: This now follows immediately from Lemmas 5.2 and 5.7. ■



References 

[1] F. Barthe, On a reverse form of the Brascamp-Lieb inequality, Invent. Math. 134 (1998), no. 2, pp. 335-361.

[2] F. Barthe and D. Cordero-Erausquin: Inverse Brascamp-Lieb inequalities along the Heat equation, in Geometric
Aspects of Functional Analysis (2002-2003), LNM 1850, Springer, 2004, pp. 65-71.

[3] F. Barthe, D. Cordero-Erausquin and B. Maurey: Entropy of spherical marginals and related inequalities, J. 
Math. Pures Appl. 86, no. 2, 2006 pp. 89-99. 

[4] W. Beckner, Inequalities in Fourier series, Ann. of Math. 102, 1975, pp. 159-182. 

[5] H.J. Brascamp and E.H. Lieb: The best constant in Young's inequality and its generalization to more than three 
functions, Adv. in Math. 20, 1976, pp. 151-173 

[6] R. Brauer and H. Weyl: Spinors in n dimensions, Am. Jour. Math., 57, 1935, pp. 425-449.

[7] J. Bennett, A. Carbery, M. Christ and T. Tao: The Brascamp-Lieb inequalities: finiteness, structure, and 
extremals, Preprint. 

[8] J. Bennett, A. Carbery, M. Christ and T. Tao: Finite bounds for Holder-Brascamp-Lieb multilinear inequalities, 
Preprint. 

[9] E.A. Carlen: Superadditivity of Fisher information and logarithmic Sobolev inequalities, Jour, of Func. Analysis, 
101, 1991, pp. 194-211. 






[10] E.A. Carlen and D Cordero-Erausquin: Subadditivity of the entropy and its relation to Brascamp-Lieb type 
inequalities, to appear in Geom. Funct. Anal. (2008). 

[11] E.A. Carlen and E.H. Lieb: Optimal Hypercontractivity for Fermi Fields and Related Non-commutative
Integration Inequalities, Comm. Math. Phys., 155, 1993, pp. 27-46.

[12] E.A. Carlen and E.H. Lieb: Optimal two-uniform convexity and fermion hypercontractivity, Proceedings of the 
Kyoto Conference in Honor of Araki's 60th birthday, Kluwer Academic Publishers, Amsterdam, 1993 

[13] E.A. Carlen, E.H. Lieb and M. Loss: A sharp form of Young's inequality on S N and related entropy inequalities, 
Jour. Geom. Analysis 14, 2004, pp. 487-520 

[14] E.A. Carlen, E.H. Lieb and M. Loss: An inequality of Hadamard type for permanents, Meth. and Appl. of
Analysis, 13, no. 1, 2006, pp. 1-17

[15] J. Dixmier: Formes lineaires sur un anneau d'operateurs, Bull. Soc. Math. France 81, 1953, pp. 222-245 

[16] T. Fack and H. Kosaki: Generalized s-numbers of $\tau$-measurable operators, Pacific J. Math., 123, 1986, pp.
269-300

[17] S. Golden: Lower bounds for Helmholtz functions. Phys. Rev. 137B, 1965, pp. 1127-1128 

[18] L. Gross: Hypercontractivity and logarithmic Sobolev inequalities for the Clifford-Dirichlet form, Duke Math. J.,
43, 1975, pp. 383-396

[19] U. Haagerup: $L^p$ spaces associated with an arbitrary von Neumann algebra, Algèbres d'opérateurs et leurs
applications en physique mathématique (Colloque CNRS, Marseille, juin 1977) Editions du CNRS, Paris, 1979,
pp. 383-396.

[20] G.H. Hardy, J.E. Littlewood and G. Polya: Inequalities. Cambridge University Press, Cambridge, 1934 

[21] H. Kosaki: Applications of the complex interpolation method to a von Neumann algebra: noncommutative
$L^p$-spaces. J. Funct. Anal., 56, 1984, no. 1, pp. 29-78.

[22] E.H. Lieb: Some convexity and subadditivity properties of entropy, Bull. Amer. Math. Soc. 81, 1975, pp. 1-13. 

[23] E.H. Lieb: Gaussian kernels have only Gaussian maximizers, Invent. Math. 102, 1990, pp. 179-208

[24] E.H. Lieb and M.B. Ruskai: Proof of the strong subadditivity of quantum-mechanical entropy, J. Math. Phys. 
14, 1973, pp. 1938-1941. 

[25] E. Nelson: Notes on non-commutative integration, Jour. Funct. Analysis, 15, 1974, pp. 103-116

[26] D. Petz: A variational expression for the relative entropy, Comm. Math. Phys., 114, 1988, pp. 345-349 

[27] G. Pisier: Non-commutative vector valued $L_p$-spaces and completely p-summing maps, Astérisque, 247, Math.
Soc. of France, Paris, 1998.

[28] T. Rockafellar: Conjugate duality and optimization, Vol. 16, Regional conference series in applied mathematics, 
SIAM, Philadelphia, 1974 

[29] M.B. Ruskai: Inequalities for quantum entropy: A review with conditions for equality, J. Math. Phys. 43, 2002,
pp. 4358-4375. Erratum ibid. 46, 2005, 019901

[30] I.E. Segal: A non-commutative extension of abstract integration, Annals of Math., 57, 1953, pp. 401-457 

[31] I.E. Segal: Tensor algebras over Hilbert spaces II, Annals of Math., 63, 1956, pp. 160-175

[32] I.E. Segal: Algebraic integration theory, Bull. Am. Math. Soc., 71, 1965, no. 3, pp. 419-489






[33] W. Stinespring: Integration theorems for gages and duality for unimodular groups, Trans. Am. Math. Soc.,
90, 1958, pp. 15-26

[34] C. Thompson: Inequality with application in statistical mechanics J. Math. Phys. 6, 1812-1813 (1965) 

[35] A. Uhlmann: Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory Com- 
mun. Math. Phys., 54, 1977, pp. 21-32 

[36] H. Umegaki: Conditional expectation in operator algebras I, Tohoku Math. J., 6, 1954, pp. 177-181 

[37] D. Voiculescu, K. Dykema and A. Nica: Free random variables, CRM Monograph Series, 1, Am. Math. Soc.,
Providence, R.I., 1992

[38] W.H. Young: On the multiplication of successions of Fourier constants, Proc. Royal Soc. A., 97, 1912, pp.
331-339

[39] W.H. Young: Sur la généralisation du théorème de Parseval, Comptes rendus, 155, 1912, pp. 30-33



